iExpertify https://www.iexpertify.com/ Upskill in 30 days, just 1 hour a day

Linux Commands for Beginners https://www.iexpertify.com/linux/linux-commands-for-beginners/ Tue, 15 Nov 2022 08:10:08 +0000

The post Linux Commands for Beginners appeared first on iExpertify.

The operating system is an essential component of any computer. Linux is an open-source, community-developed operating system used on mainframes, servers, and many other kinds of machines. It is one of the most widely used operating systems in the world.

Searching

Search for a specific pattern in a file with:

grep [pattern] [file_name]

Recursively search for a pattern in a directory:

grep -r [pattern] [directory_name]

Find all files and directories related to a particular name:

locate [name]

List names that begin with a specified character [a] in a specified location [/folder/location] by using the find command (the trailing * matches the rest of the name):

find [/folder/location] -name "[a]*"

See files larger than a specified size [+100M] in a folder:

find [/folder/location] -size [+100M]
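As a minimal sketch of the search commands above, the following builds a throwaway directory and runs grep and find against it. The directory layout, file names, and the word "error" are illustrative assumptions; locate is left out because it depends on a prebuilt index that may not exist on every system.

```shell
# Work in a throwaway directory so nothing real is touched (names are illustrative).
demo=$(mktemp -d)
mkdir -p "$demo/logs"
printf 'all good\nerror: disk full\n' > "$demo/logs/app.log"
printf 'nothing to see\n' > "$demo/notes.txt"

# Search one file for a pattern.
grep "error" "$demo/logs/app.log"

# Search a whole directory tree; -l prints only the names of matching files.
matches=$(grep -rl "error" "$demo")
echo "$matches"

# Find entries whose names begin with "a" (quote the pattern so the shell leaves it alone).
found=$(find "$demo" -name "a*")
echo "$found"

# Find files larger than 100 MiB; this demo tree has none, so nothing is found.
big=$(find "$demo" -type f -size +100M)
echo "large files: ${big:-none}"
```

Quoting the -name pattern matters: without the quotes, the shell may expand the wildcard against the current directory before find ever sees it.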

File Commands

List files in the directory:

ls

List all files (shows hidden files):

ls -a

Show directory you are currently working in:

pwd

Create a new directory:

mkdir [directory]

Remove a file:

rm [file_name] 

Remove a directory recursively:

rm -r [directory_name]

Recursively remove a directory without requiring confirmation:

rm -rf [directory_name]

Copy the contents of one file to another file:

cp [file_name1] [file_name2]

Recursively copy the contents of one directory to a second directory:

cp -r [directory_name1] [directory_name2]

Rename [file_name1] to [file_name2] with the command:

mv [file_name1] [file_name2]

Create a symbolic link to a file:

ln -s /path/to/[file_name] [link_name]

Create a new file:

touch [file_name]

Show the contents of a file:

more [file_name]

Or use the cat command, which can create and view single or multiple files and concatenate them. It prints the contents of the files to the terminal, and its output can be redirected to another file.

cat [file_name]

Append file contents to another file:

cat [file_name1] >> [file_name2]

Display the first 10 lines of a file with:

head [file_name]

Show the last 10 lines of a file:

tail [file_name]

Encrypt a file:

gpg -c [file_name]

Decrypt a file:

gpg [file_name.gpg]

Show the number of lines, words, and bytes in a file:

wc [file_name]
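The viewing and appending commands above can be tried safely on a scratch file. This is a small sketch; the file names and the 12-line sample are assumptions made for illustration.

```shell
# Build a 12-line sample file in a temporary directory (file names are illustrative).
demo=$(mktemp -d)
seq 1 12 > "$demo/numbers.txt"
printf 'extra line\n' > "$demo/extra.txt"

# head and tail print 10 lines by default; -n changes the count.
first=$(head -n 3 "$demo/numbers.txt")
last=$(tail -n 2 "$demo/numbers.txt")
echo "$first"
echo "$last"

# Append one file to another with >>, then count the lines in the result.
cat "$demo/extra.txt" >> "$demo/numbers.txt"
lines=$(wc -l < "$demo/numbers.txt")
echo "lines after append: $lines"
```

Reading the file through < makes wc print only the number, without the file name after it.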

Hardware Information

Show bootup messages:

dmesg

See CPU information:

cat /proc/cpuinfo

Display free and used memory with:

free -h

List hardware configuration information:

lshw

See information about block devices:

lsblk

Show PCI devices in a tree-like diagram:

lspci -tv

Display USB devices in a tree-like diagram:

lsusb -tv

Show hardware information from the BIOS:

dmidecode

Display disk data information:

hdparm -i /dev/[device]

Conduct a read-speed test on device/disk:

hdparm -tT /dev/[device]

Test for unreadable blocks on device/disk:

badblocks -s /dev/[device]

Directory Navigation

Move up one level in the directory tree structure:

cd ..

Change directory to $HOME:

cd

Change location to a specified directory:

cd /chosen/directory

File Compression

Archive an existing file:

tar cf [compressed_file.tar] [file_name]

Extract an archived file:

tar xf [compressed_file.tar]

Create a gzip compressed tar file by running:

tar czf [compressed_file.tar.gz] [file_name]

Compress a file with the .gz extension:

gzip [file_name]
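A round trip makes the tar options easier to remember. The sketch below assumes a scratch directory and made-up names (project, restore, readme.txt); it creates a compressed archive, lists it, and extracts it somewhere else.

```shell
# Pack a small directory into a gzip-compressed archive and unpack it elsewhere.
demo=$(mktemp -d)
mkdir "$demo/project" "$demo/restore"
printf 'hello\n' > "$demo/project/readme.txt"

# c = create, z = gzip, f = archive file name; -C changes directory first so the
# archive stores the relative path project/readme.txt.
tar czf "$demo/project.tar.gz" -C "$demo" project

# t lists the archive contents without extracting.
tar tzf "$demo/project.tar.gz"

# x = extract; -C picks the destination directory.
tar xzf "$demo/project.tar.gz" -C "$demo/restore"
restored=$(cat "$demo/restore/project/readme.txt")
echo "$restored"
```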

File Transfer

Copy a file to a server directory securely:

scp [file_name.txt] [server]:/tmp

Synchronize the contents of a directory with a backup directory using the rsync command:

rsync -a [/your/directory] [/backup/] 

Users

See the user and group IDs of the current user:

id

Show last system logins:

last

Display who is currently logged into the system with the who command:

who

Show which users are logged in and their activity:

w

Add a new group by typing:

groupadd [group_name]

Add a new user:

adduser [user_name]

Add a user to a group:

usermod -aG [group_name] [user_name]

Temporarily elevate user privileges to superuser or root using the sudo command:

sudo [command_to_be_executed_as_superuser]

Delete a user:

userdel [user_name] 

Modify user information with:

usermod

Package Installation

List all installed packages with yum:

yum list installed

Find a package by a related keyword:

yum search [keyword]

Show package information and summary:

yum info [package_name]

Install a package using the YUM package manager:

yum install [package_name]

Install a package using the DNF package manager:

dnf install [package_name]

Install a package using the APT package manager:

apt-get install [package_name]

Install an .rpm package from a local file:

rpm -i [package_name.rpm]

Remove an .rpm package:

rpm -e [package_name]

Install software from source code:

tar zxvf [source_code.tar.gz]
cd [source_code]
./configure
make
make install

Process Related

See a snapshot of active processes:

ps

Show processes in a tree-like diagram:

pstree

Display the memory usage map of a process:

pmap [process_id]

See all running processes:

top

Terminate a Linux process under a given ID:

kill [process_id]

Terminate a process under a specific name:

pkill [proc_name]

Terminate all processes labelled “proc”:

killall [proc_name]

Resume the most recently stopped job in the background (run jobs to list stopped and background jobs):

bg

Bring the most recently suspended job to the foreground:

fg

Bring a particular job to the foreground:

fg [job]

List files opened by running processes:

lsof
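To see the process commands above work together without touching anything important, one can start a harmless process and terminate it. This is a sketch only; sleep 300 stands in for a real long-running program.

```shell
# Start a long-running process in the background; $! holds its process ID.
sleep 300 &
pid=$!

# Confirm it is running, then terminate it by ID (kill sends SIGTERM by default).
ps -p "$pid" > /dev/null && echo "process $pid is running"
kill "$pid"

# wait collects the exit status of the child; it is non-zero because sleep
# was ended by a signal rather than finishing normally.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```

The status follows the usual shell convention of 128 plus the signal number, so a process ended by SIGTERM (signal 15) reports 143.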

Network

List IP addresses and network interfaces:

ip addr show

Assign an IP address to interface eth0:

ip address add [IP_address] dev eth0

Display IP addresses of all network interfaces with:

ifconfig

See active (listening) ports with the netstat command:

netstat -pnltu

Show TCP and UDP ports and their programs (the same options as above, in a different order):

netstat -nutlp

Display more information about a domain:

whois [domain]

Show DNS information about a domain using the dig command:

dig [domain] 

Do a reverse lookup of an IP address:

dig -x [ip_address]

Perform an IP lookup for a domain:

host [domain]

Show the local IP address:

hostname -I

Download a file from a URL using the wget command:

wget [URL]

Linux Keyboard Shortcuts

Kill process running in the terminal:

Ctrl + C

Stop current process:

Ctrl + Z

The process can be resumed in the foreground with fg or in the background with bg.

Cut one word before the cursor and add it to the shell's kill ring (its internal clipboard):

Ctrl + W

Cut part of the line before the cursor and add it to the kill ring:

Ctrl + U

Cut part of the line after the cursor and add it to the kill ring:

Ctrl + K

Paste from the kill ring:

Ctrl + Y

Recall last command that matches the provided characters:

Ctrl + R

Run the previously recalled command:

Ctrl + O

Exit command history without running a command:

Ctrl + G

Run the last command again:

!!

Log out of current session:

exit

File Permission

The chmod command in Linux changes file and directory permissions, and the chown command changes file and directory ownership.

Assign read, write, and execute permission to everyone:

chmod 777 [file_name]

Give read, write, and execute permission to owner, and read and execute permission to group and others:

chmod 755 [file_name]

Assign full permission to owner, and read and write permission to group and others:

chmod 766 [file_name]

Change the ownership of a file:

chown [user] [file_name]

Change the owner and group ownership of a file:

chown [user]:[group] [file_name]

Disk Usage

You can use the df and du commands to check disk space in Linux.

See free and used space on mounted filesystems:

df -h

Show free inodes on mounted filesystems:

df -i

Display disk partitions, sizes, and types with the command:

fdisk -l

See disk usage for all files and directories:

du -ah

Show disk usage of the directory you are currently in:

du -sh

Display target mount points for all filesystems:

findmnt

Mount a device:

mount [device_path] [mount_point]
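The difference between du and df is easiest to see side by side: du adds up the space used by files under a path, while df reports on the filesystem that holds the path. A small sketch, assuming a scratch directory and an illustrative 1 MiB file named blob.bin:

```shell
demo=$(mktemp -d)
# Create a file of exactly 1 MiB of zero bytes.
dd if=/dev/zero of="$demo/blob.bin" bs=1024 count=1024 2>/dev/null

# -s summarizes the whole directory, -h prints human-readable units.
du -sh "$demo"

# df reports on the filesystem that contains the given path.
df -h "$demo"

bytes=$(wc -c < "$demo/blob.bin")
echo "blob.bin is $bytes bytes"
```

The figure du prints can differ slightly from the byte count because it measures allocated disk blocks, not file length.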

SSH Login

Connect to host as user:

ssh user@host

Securely connect to host via SSH default port 22:

ssh host

Connect to host using a particular port:

ssh -p [port] user@host

Connect to host via telnet default port 23:

telnet host

Linux File Permissions -chmod https://www.iexpertify.com/linux/linux-file-permissions/ Thu, 16 Jun 2022 13:50:33 +0000

The post Linux File Permissions -chmod appeared first on iExpertify.


Linux is a multi-user operating system that many users can access concurrently. It therefore needs a way to prevent people from accessing other users' confidential files. Linux can also be used on mainframes and servers without any modifications. It uses the concepts of ownership and permissions to secure directories and files.

Linux File Permissions

Types of permissions

Every directory and file in Linux has three basic permission types. They are discussed as follows:

#1 Read Permission

The read permission enables you to open and read a file. For a directory, the read permission enables the user to list the contents of the directory.

#2 Write Permission

The write permission allows the user to modify the file and write new data to the file. For a directory, the write permission allows the user to modify the content of the directory. The user can add, remove, or rename files that belong to a particular directory. 

#3 Execute Permissions

The execute permission allows the user to execute the file as a shell script or a program. For a directory, the execute permission lets the user enter it with the cd command and access the files in it, but it does not allow listing its contents.

Viewing the permissions

You can view the permissions of a directory or file in the GUI file manager or by reviewing the output of the command given below.

ls -l 

Permissions groups

Every directory and file on Linux is owned by a specific user and group, and permissions are defined separately for three user-based permission groups. They are as follows:

# User

A user is a person who owns the directory or file. By default, the user who creates the file or directory will be the owner. 

# Group

The user group that owns the directory or file. All users who belong to this group have the same permissions to access the file or directory, and these permissions do not affect other users.

# Other

Any user who is not the owner of the directory or file and doesn't belong to the group that owns it. Simply put, permissions set for the 'other' category apply to everyone else.

If you want to view the users on the system, you can view the user using the command below:

cat /etc/passwd

Similarly, you can view the group on the system by using the command below:

cat /etc/group

-rw-rw-r-- is a code that represents the permissions given to the owner, the user group, and the world.

Here, the leading '-' indicates that the entry is a regular file. For a directory, it is 'd'.

The characters are simple to remember and understand.

  • r: read permission
  • w: write permission
  • x: execute permission
  • -: no permission

The first part of the code, 'rw-', means the owner can read and write the file but cannot execute it, since the execute bit is set to '-'. Several Linux distributions, such as CentOS, Ubuntu, and Fedora, add each user to a group with the same name as the username.

The second part of the code, 'rw-', applies to the user group: group members can read and write the file.

The third part of the code, 'r--', applies to all other users, who can only read the file.
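The breakdown above can be reproduced on a scratch file. This is a minimal sketch; the file name sample and the use of cut to slice the string are illustrative choices, not something the original prescribes.

```shell
# Create a scratch file and give it the permissions discussed above: rw-rw-r--.
demo=$(mktemp -d)
touch "$demo/sample"
chmod 664 "$demo/sample"

# The first field of ls -l is the type character followed by three rwx triplets.
ls -l "$demo/sample"
perms=$(ls -l "$demo/sample" | cut -c1-10)

# Split the 10-character string into its parts.
echo "type:  $(echo "$perms" | cut -c1)"
echo "owner: $(echo "$perms" | cut -c2-4)"
echo "group: $(echo "$perms" | cut -c5-7)"
echo "other: $(echo "$perms" | cut -c8-10)"
```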

Changing file permission using chmod

With the help of the change mode command, 'chmod', we can set permissions such as read, write, and execute on a directory or file for the owner, the group, and others.

chmod <permission-number> <file-name>

Here, the permission number is calculated by using the assigned values for r, w, and x. The basic permission number includes three digits. Some special cases can use four digits as a permission number. 

There are two ways to use the command. They are as follows:

1. Numeric mode

In numeric mode, file permissions are represented not as characters but as a three-digit octal number. The following table provides the numbers for all permission types.

Number  Symbol  Permission Type
0       ---     No permission
1       --x     Execute
2       -w-     Write
3       -wx     Write+Execute
4       r--     Read
5       r-x     Read+Execute
6       rw-     Read+Write
7       rwx     Read+Write+Execute

For example, see the command below.

$ chmod 764 sample

$ ls -l sample

-rwxrw-r-- 1 iExpertify iExpertify 20 June 20 06:00 sample

In the above command, we changed the file permissions to 764, which represents the following:

  • The owner can read, write, and execute
  • The user group can read and write
  • Any user can only read
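Each digit of the permission number is simply the sum of read (4), write (2), and execute (1). The sketch below works out 764 that way and applies it to a scratch file; the variable and file names are illustrative.

```shell
# Each digit is the sum of read (4), write (2), and execute (1).
owner=$((4 + 2 + 1))   # rwx
group=$((4 + 2))       # rw-
other=$((4))           # r--
mode="$owner$group$other"
echo "mode: $mode"

# Apply the computed mode to a scratch file and read it back.
demo=$(mktemp -d)
touch "$demo/sample"
chmod "$mode" "$demo/sample"
perms=$(ls -l "$demo/sample" | cut -c1-10)
echo "$perms"
```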

2. Symbolic mode

In this mode, we can change permissions for all three owner classes at once or modify the permissions of a specific class: u (user/owner), g (group), o (other), or a (all). Mathematical symbols specify how the permissions change.

Operator  Description
+         Adds permission to access directory or files
-         Removes permissions
=         Sets permission and overrides the permissions set earlier

Let’s see the example:

Current file

$ ls -l sample

-rw-rw-r-- 1 iExpertify iExpertify 22 2020-06-29 13:45 sample

Setting permission to other users

$ chmod o=rwx sample

$ ls -l sample

-rw-rw-rwx 1 iExpertify iExpertify 22 2020-06-29 13:45 sample

Adding execute permission to the user group

$ chmod g+x sample

$ ls -l sample

-rw-rwxrwx 1 iExpertify iExpertify 22 2020-06-29 13:45 sample

Removing read permission for the user

$ chmod u-r sample

$ ls -l sample

--w-rwxrwx 1 iExpertify iExpertify 22 2020-06-29 13:45 sample
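The three symbolic changes above can be replayed in order on a scratch file. A minimal sketch, assuming a file that starts at rw-rw-r-- (664); the helper function show is a hypothetical convenience defined here, not a standard command.

```shell
demo=$(mktemp -d)
touch "$demo/sample"
chmod 664 "$demo/sample"                    # start from rw-rw-r--

# Hypothetical helper: print the 10-character permission string of the file.
show() { ls -l "$demo/sample" | cut -c1-10; }

chmod o=rwx "$demo/sample"; step1=$(show)   # set "other" to exactly rwx
chmod g+x "$demo/sample";   step2=$(show)   # add execute for the group
chmod u-r "$demo/sample";   step3=$(show)   # remove read from the owner
printf '%s\n' "$step1" "$step2" "$step3"
```

Note that the class letter for "other" is a lowercase o, not the digit zero.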

Changing ownership and group

To change the ownership of a directory or file, use the command below:

chown user filename

If you want to change the user along with the group for a directory or file, use the command below:

chown user:group filename

If you wish to change the group owner only, use the command below:

chgrp group_name filename

Here, chgrp stands for change group.

Advanced permissions

The special characters that can appear in the permissions string are as follows:

  • -: There are no special permissions.
  • d: The entry is a directory.
  • l: The entry is a symbolic link to a directory or a file.
  • t: The sticky bit is set. The 't' appears in the execute position of the 'other' permissions.
  • s: The setuid or setgid bit is set. The 's' appears in the execute position of the owner or group permissions.

Setuid or setgid special permissions

The setuid and setgid permissions tell the system to run an executable with the permissions of the file's owner (setuid) or group (setgid) rather than those of the user who launched it. We can assign these permissions by defining them explicitly.

The character that represents setuid or setgid is 's'. To set the setgid bit on file1.sh, use the command below (use u+s instead of g+s to set the setuid bit):

chmod g+s file1.sh

Be careful while using setuid or setgid permissions. If you assign them incorrectly, you can open your system to intrusion.

Sticky bit special permissions

The sticky bit can be useful in a shared environment: when it is set on a directory, only the owner of a file (or the owner of the directory, or root) can rename or delete that file.

The character for the sticky bit is 't'. To set the sticky bit on a directory, use the command below:

chmod +t dir1
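To see where the 's' and 't' characters land in the permissions string, both bits can be set on a scratch directory. This is a sketch under the assumption of a typical Linux system where you own the directory and belong to its group; the directory name shared is illustrative.

```shell
demo=$(mktemp -d)
mkdir "$demo/shared"
chmod 755 "$demo/shared"                      # start from rwxr-xr-x

chmod g+s "$demo/shared"                      # setgid: s replaces x in the group triplet
after_setgid=$(ls -ld "$demo/shared" | cut -c1-10)

chmod +t "$demo/shared"                       # sticky bit: t replaces x in the other triplet
after_sticky=$(ls -ld "$demo/shared" | cut -c1-10)

printf '%s\n' "$after_setgid" "$after_sticky"
```

On a directory, setgid has a different effect than on an executable: new files created inside typically inherit the directory's group.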

Tip

  • The file /etc/group contains all the groups defined in the system
  • You can use the command “groups” to find all the groups you are a member of
  • You can use the command newgrp to work as a member of a group other than your default group
  • You cannot have 2 groups owning the same file.
  • You do not have nested groups in Linux. One group cannot be a sub-group of another
  • x: executing a directory means being allowed to “enter” a dir and gain possible access to sub-dirs
  • There are other permissions that you can set on Files and Directories which will be covered in a later advanced tutorial

Summary:

  • Linux being a multi-user system uses permissions and ownership for security.
  • There are three user types on a Linux system, viz. User, Group, and Other
  • Linux divides the file permissions into read, write, and execute, denoted by r, w, and x
  • The permissions on a file can be changed with the 'chmod' command, which can be used in Numeric (absolute) or Symbolic mode
  • The 'chown' command can change the ownership of a file/directory. Use the following commands: chown user file or chown user:group file
  • The 'chgrp' command can change the group ownership: chgrp group filename
  • What does x – eXecuting a directory mean? A: Being allowed to “enter” a dir and gain possible access to sub-dirs.

Grep Regex Example Linux https://www.iexpertify.com/linux/grep_regex_example_linux/ Tue, 21 Dec 2021 21:12:35 +0000

The post Grep Regex Example Linux appeared first on iExpertify.


What are Regular Expressions?

grep is one of the most popular command-line utilities for finding strings in a text file.

Using the grep command with regular expressions makes it even more powerful.

Regular expressions come in the picture when you want to search for a text containing a particular pattern.

It simplifies your search operation by searching the patterns on each line of the file.

Types of Regular expressions

For ease of understanding let us learn the different types of Regex one by one.

Basic Regular expressions

Some of the commonly used commands with Regular expressions are tr, sed, vi and grep. Listed below are some of the basic Regex.

Symbol  Description
.       Matches any single character
^       Matches the start of a string (line)
$       Matches the end of a string (line)
*       Matches the preceding character zero or more times
\       Escapes a special character so that it is matched literally
()      Groups regular expressions
?       Matches the preceding character zero or one time

Let’s create a sample test.txt file with the following content:

cat test.txt

Output:

This record ends in 2021
2021 is the start of the record

2021
2020

This record ends in 2020
2020 is the start of the record

For example, find all the lines which end with the word 2021:

grep "2021$" test.txt

You should see the following output:

This record ends in 2021

2021

Next, find all the lines which start and end with the word 2021:

grep "^2021$" test.txt

You should see the following output:

2021

Next, find all the lines which end with 2020 or 2021 (the dot matches any single character in that position):

grep "202.$" test.txt

You should see the following output:

This record ends in 2021

2021

This record ends in 2020

2020

Now, let's display all the lines that start with the string 2021:

grep "^2021" test.txt

You should see the following output:

2021 is the start of the record
2021

Next, find the blank lines in the file test.txt (add the -c option to print how many there are instead):

grep "^$" test.txt

You should see the following output:

<3 empty lines>
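The anchor patterns above can be checked in one go by counting matches with grep -c. This sketch recreates test.txt as it appears in the listing above, which shows two blank lines; that count is an assumption about the original file, which may have contained more.

```shell
demo=$(mktemp -d)
# Recreate the sample file from the listing above, including its two blank lines.
cat > "$demo/test.txt" <<'EOF'
This record ends in 2021
2021 is the start of the record

2021
2020

This record ends in 2020
2020 is the start of the record
EOF

# -c prints the number of matching lines instead of the lines themselves.
ends_2021=$(grep -c "2021$" "$demo/test.txt")    # lines ending in 2021
only_2021=$(grep -c "^2021$" "$demo/test.txt")   # lines that are exactly 2021
ends_202x=$(grep -c "202.$" "$demo/test.txt")    # lines ending in 202 plus any character
blank=$(grep -c "^$" "$demo/test.txt")           # empty lines
echo "$ends_2021 $only_2021 $ends_202x $blank"
```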

Interval Regular expressions

These expressions tell us about the number of occurrences of a character in a string. They are:

Expression  Description
{n}         Matches the preceding character appearing exactly 'n' times
{n,m}       Matches the preceding character appearing at least 'n' times but not more than 'm' times
{n,}        Matches the preceding character only when it appears 'n' times or more

Example:

We want to check that the character 'p' appears 2 times in a row in a string. For this the syntax would be:

cat sample | grep -E 'p{2}'

Note: You need to add -E with these regular expressions.
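The note about -E is easiest to understand by running the same interval three ways. A small sketch, assuming a two-line file named sample whose contents (apple, maple) are made up for illustration:

```shell
demo=$(mktemp -d)
printf 'apple\nmaple\n' > "$demo/sample"

# Extended syntax (-E): braces are written as-is.
ere=$(grep -E 'p{2}' "$demo/sample")

# Basic syntax (no -E): the braces must be escaped with backslashes.
bre=$(grep 'p\{2\}' "$demo/sample")

# Without -E and without backslashes, the braces are ordinary characters,
# so grep looks for the literal text p{2} and finds nothing.
literal=$(grep -c 'p{2}' "$demo/sample" || true)
echo "$ere $bre $literal"
```

Single quotes keep the shell from touching the braces or backslashes before grep sees them.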

Extended regular expressions

These regular expressions contain combinations of more than one expression. Some of them are:

Expression  Description
\+          Matches one or more occurrences of the previous character
\?          Matches zero or one occurrence of the previous character

Example:

Suppose we want to find lines where one or more occurrences of the character 'a' come directly before the character 't'.

We can use a command like:

cat sample | grep "a\+t"

With the -E option, write these operators without the backslash (for example, grep -E "a+t").

The backslash (\) is used to search for special characters literally.

Let’s create a sample test.txt file with the following contents:

cat test.txt

Output:

1.1.1.1
1a1a1a1
1b1c1d1

Now, search for all the lines which match the pattern “1.1.1.1”:

grep "1.1.1.1" test.txt

This command does not give the intended result, because “.” matches any single character:

1.1.1.1
1a1a1a1
1b1c1d1

You can escape each dot with a backslash (“\”) to resolve this issue:

grep "1\.1\.1\.1" test.txt

Output:

1.1.1.1
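The effect of escaping can be confirmed by counting matches. The sketch below recreates the three-line file from above and also shows grep's -F option, which treats the whole pattern as plain text and is an alternative to escaping each dot.

```shell
demo=$(mktemp -d)
printf '1.1.1.1\n1a1a1a1\n1b1c1d1\n' > "$demo/test.txt"

unescaped=$(grep -c "1.1.1.1" "$demo/test.txt")      # . matches any character
escaped=$(grep -c "1\.1\.1\.1" "$demo/test.txt")     # \. matches a literal dot
fixed=$(grep -cF "1.1.1.1" "$demo/test.txt")         # -F treats the pattern as plain text
echo "$unescaped $escaped $fixed"
```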

Brace expansion

The syntax for brace expansion is either a sequence or a comma separated list of items inside curly braces “{}”. The starting and ending items in a sequence are separated by two periods “..”.

Some examples:
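A minimal sketch of both forms, a sequence and a comma-separated list. The specific ranges and the file_ prefix are illustrative choices, not taken from the original examples. Brace expansion is a bash feature rather than part of plain POSIX sh, so bash is invoked explicitly here.

```shell
# Brace expansion happens in the shell before echo runs.
sequence=$(bash -c 'echo {1..5}')                 # numeric sequence
letters=$(bash -c 'echo {a..e}')                  # letter sequence
list=$(bash -c 'echo file_{draft,final}.txt')     # comma-separated list with a prefix and suffix
printf '%s\n' "$sequence" "$letters" "$list"
```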

In the above examples, the echo command creates strings using the brace expansion.

Returning to interval expressions, let’s create a sample test.txt file with the following contents:

cat test.txt

Output:

apple
appple
appppple

Now, search for all the lines which match a character “p” two times:

grep -E "ap{2}l" test.txt

You should see the following output:

apple

Next, search for all the lines which match a character “p” two or more times:

grep -E "ap{2,}l" test.txt

You should see the following output:

apple
appple
appppple

Next, search for all the lines which match a character “p” two or three times:

grep -E "ap{2,3}l" test.txt

You should see the following output:

apple
appple

Square bracket expansion

The regular expression [] can be used to match any one character found within the bracket group.

For example, search for all the lines that contain “test” followed by any one character in the range x to z:

grep "test[x-z]" test.txt

You should see the following output:

testx
testy
testz
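Since the contents of test.txt are not shown for this example, the sketch below assumes a five-line file (testa, testx, testy, testz, test1) to illustrate a range, a plain set, and a negated set.

```shell
demo=$(mktemp -d)
printf 'testa\ntestx\ntesty\ntestz\ntest1\n' > "$demo/test.txt"

range=$(grep -c "test[x-z]" "$demo/test.txt")      # x, y, or z
either=$(grep -c "test[a1]" "$demo/test.txt")      # a or 1
negated=$(grep -c "test[^x-z]" "$demo/test.txt")   # any character except x, y, z
echo "$range $either $negated"
```

A caret as the first character inside the brackets inverts the set, which is different from the caret that anchors a match to the start of a line.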

What is Data Modeling? https://www.iexpertify.com/data-warehousing/what-is-data-modeling-2/ Tue, 30 Nov 2021 18:32:54 +0000

The post What is Data Modeling? appeared first on iExpertify.


What is Data Modeling?

The method of creating a model of a data store is called data modeling. Data modeling is the process of identifying the entities in our domain, the relationships between these entities, and how they will be stored in the database. It introduces conceptual data objects and the connections between them. Data modeling formulates data in a standardized format within an information system. It helps to analyze data quickly, which helps to meet business needs. The process requires data modelers who work closely with stakeholders and prospective users of the IT system. Data modeling ends with the development of a data model that supports the infrastructure of the business information system.

Understanding Data Modeling / Scope

It occurs at three different layers:

  • Physical model: It is a schema which says how data is stored physically in the database
  • Conceptual model: It is the user view of the data i.e. the high level which the user sees.
  • Logical model: It sits between the Physical model and conceptual model and it represents the data logically, separate from its physical stores.

Hierarchical Data Modeling: These models were used to replace file-based systems. The data was kept in a tree-like, one-to-many arrangement.

Relational Data Modeling: The hierarchical model helped us move away from file-based systems, which reduced complexity, but one still had to know the specific physical data storage employed. A relational database follows the relational model, where data is stored in tables, unlike a hierarchical database, where it is stored in a tree-like structure. In short, it reduced complexity further compared to the hierarchical model.

How does Data Modeling make work so easy/why should we use it?

It gives us a visual representation of data and enforces business logic, regulations, policies, etc. on the data. It is a guide used by scientists and analysts in designing and implementing a database. Without data modeling, implementing business requirements in a database becomes difficult for them.

  • Data modeling is query based; that is, we think of the application workflow and the queries early in the data modeling process
  • A table is how databases store data and can be thought of as a set of rows and columns
  • After designing the entities (or tables) that we need, we decide how the tables will be related to each other
  • The next step is to define the primary and foreign keys for the tables. The primary key uniquely identifies a single record in the table. A foreign key is used to relate to other tables within the data model.
  • For operational databases/transactional systems, the goal of the data model is to reduce duplication so that we don’t store the same data in multiple places.
  • For dimensional models, data warehouses, data marts, etc., the goal is to optimize the select query. Duplication of data is not considered an issue. We do not optimize for writes, as writes are considered cheaper than select statements. So, we design the database to serve all the fields needed by the select queries that the BI tools run, the reports that get generated, or the models that data scientists work with.
  • One of our goals in data modeling for data warehouses is even data distribution. Because the data is dispersed across nodes (servers), we aim for a distribution that lets a query run within a single node or spreads its load equally across all nodes, depending on the database we are working with (say, Teradata vs. Netezza)
  • Selecting the Primary Key is very important and has a huge impact on query performance

Why do we need Data Modeling? / What can you do with it?

The main goals of using it are:

  • To ensure that all data objects are represented correctly; if they are not, we would get incorrect results.
  • As stated earlier, it helps to design the database at the conceptual, physical, and logical levels.
  • It helps to design the relational tables, primary keys, foreign keys, etc.
  • Database developers can create a better physical database with a good model, as it becomes a guiding tool for them.
  • It helps to identify missing and redundant data.
  • It helps us build a better IT infrastructure with easier and cheaper maintenance in the long run, though it is time-consuming initially.

Working

Now let’s create a sample data model to understand how to work with a model. To do this, we follow certain steps:

  • First, we have to understand the requirements. In this case, we will create a model for an online store. Keeping that in mind, we need two tables: a) customers b) products
  • The next step is to get the attributes of the tables or entities

a. customer table can have attributes like:

  • Id
  • Name
  • Email
  • Address

b. Product table can have attributes like:

  • Id
  • Name

In the customer table, we can have Id as Primary key and similarly Product Id in Product table will be the primary key.

Now, we will design the relationship between these two tables. So to connect the customer and product table we will create a table called purchase which will be like an order table (i.e. which customer ordered which product).

Who is the right audience for learning this technology?

Data modeling is an essential skill. The right audience for learning modeling techniques is data architects and data analysts. Most individuals start as data analysts and then move up the ladder.

How this technology will help you in career growth?

According to Glassdoor, the average salary for data modelers is about $78,601, so it is a well-paid job. Most big companies invest in modelers, as they are essential for maintaining the integrity of data.

Conclusion

In conclusion, we can say that the models created by modelers ensure consistency in naming conventions, integrity, and security of data, and good data enables the business to use its data correctly and efficiently.

The post What is Data Modeling? appeared first on iExpertify.

]]>
Input Redirection in Linux/Unix Examples https://www.iexpertify.com/linux/input-output-redirection-in-linux-unix-examples/ Sun, 10 Oct 2021 23:10:36 +0000 https://www.iexpertify.com/?p=3260 Reading Time: 2 minutes

What is Redirection?

Redirection is a feature in Linux that lets you change the standard input/output devices when executing a command. The basic workflow of any Linux command is that it takes an input and gives an output.

  • The standard input (stdin) device is the keyboard.
  • The standard output (stdout) device is the screen.

With redirection, the above standard input/output can be changed.

In this tutorial, we will learn:

  • Output Redirection
  • Input Redirection
  • File Descriptors (FD)
  • Error Redirection
  • Why Error Redirection?
  • Examples

Output Redirection

The '>' symbol is used for output (STDOUT) redirection.

Example:

ls -al > listings

Here the output of the command ls -al is redirected to the file "listings" instead of your screen.

Note: Use the correct file name while redirecting command output to a file. If there is an existing file with the same name, the redirected command deletes the contents of that file and overwrites it.

If you do not want a file to be overwritten but want to add more content to an existing file, use the '>>' operator.

You can redirect standard output not just to files, but also to devices:

$ cat music.mp3 > /dev/audio

The cat command reads the file music.mp3 and sends the output to /dev/audio, which is the audio device on systems that provide it. If the sound configuration of your PC is correct and the device accepts the file's format, this command plays the file.

Input Redirection

The '<' symbol is used for input (STDIN) redirection.

Example: The mail program in Linux can help you send emails from the terminal. You can type the contents of the email using the standard input device, the keyboard. But if you want to send the contents of a file, you can use the input redirection operator in the following format:

mail -s "Subject" to-address < Filename

This sends the contents of the file as the body of the email to the recipient.

The above examples were simple. Let's look at some advanced redirection techniques which make use of File Descriptors.

File Descriptors (FD)

In Linux/Unix, everything is a file. Regular files, directories, and even devices are files. Every open file has an associated number called a File Descriptor (FD).

Your screen also has a File Descriptor. When a program is executed, the output is sent to the File Descriptor of the screen, and you see the program output on your monitor. If the output were sent to the File Descriptor of the printer, the program output would be printed.

Error Redirection

Whenever you execute a program/command at the terminal, 3 files are always open: standard input, standard output, and standard error. These files are always present whenever a program is run. As explained before, a file descriptor is associated with each of these files.

File | File Descriptor
Standard Input (STDIN) | 0
Standard Output (STDOUT) | 1
Standard Error (STDERR) | 2

By default, the error stream is displayed on the screen. Error redirection is routing the errors to a file other than the screen.

Why Error Redirection?

Error redirection is one of the very popular features of Unix/Linux. Frequent UNIX users will know that many commands give you massive amounts of errors.

  • For instance, while searching for files, one typically gets permission denied errors. These errors usually do not help the person searching for a particular file.
  • While executing shell scripts, you often do NOT want error messages cluttering up the normal program output.

The solution is to redirect the error messages to a file.

Example 1

$ myprogram 2>errorsfile

Above we are executing a program named myprogram. The file descriptor for standard error is 2. Using "2>" we redirect the error output to a file named "errorsfile". Thus, program output is not cluttered with errors.

Example 2

Here is another example which uses the find command:

find . -name 'my*' 2>error.log

Using the "find" command, we are searching the current directory "." for files with a name starting with "my", while any error messages are written to error.log.

Example 3

Let's see a more complex example. Server administrators frequently list directories and store both error and standard output in a file, which can be processed later. Here is the command:

ls Documents ABC > dirlist 2>&1

Here, ">&" redirects one stream to the destination of another. 2>&1 means that STDERR redirects to the target of STDOUT (which is the file dirlist). We are redirecting error output to standard output, which in turn is being redirected to the file dirlist. Hence, both outputs are written to the file dirlist.

Summary

  • Each open file in Linux has a corresponding File Descriptor associated with it.
  • The keyboard is the standard input device, while your screen is the standard output device.
  • ">" is the output redirection operator. ">>" appends output to an existing file.
  • "<" is the input redirection operator.
  • ">&" redirects the output of one stream to another.
  • You can redirect errors using the corresponding File Descriptor, 2.

The post Input Redirection in Linux/Unix Examples appeared first on iExpertify.

]]>

Just as the output of a command can be redirected to a file, so can the input of a command be redirected from a file. As the greater-than character > is used for output redirection, the less-than character < is used to redirect the input of a command.

We may want a file to be the input for a command that normally wouldn’t accept a file as an option. This redirecting of input is done using the “<” (less-than symbol) operator.

Below is an example of sending a file to somebody, using input redirection.

$ mail john.dba@organization.com < sample_ddl

This is a bit harder to read than the beginner’s cat file | mail someone, but it is a more elegant use of the available tools because it does not start an extra cat process.
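Since sending mail needs a configured mail system, the same idea can be tried out with a command that has no file-name argument at all. The following is a minimal sketch, assuming a POSIX shell; the file name sample_ddl is borrowed from the example above, its contents are made up for illustration, and tr stands in for mail because it reads only from standard input.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a small sample file (the contents are invented for this sketch).
printf 'create table customers;\ncreate table products;\n' > sample_ddl

# tr takes no file-name argument; it reads standard input only,
# so "<" is how the file reaches it.
upper=$(tr 'a-z' 'A-Z' < sample_ddl)

# The pipeline form produces the same text but starts an extra cat process.
piped=$(cat sample_ddl | tr 'a-z' 'A-Z')

# Prints the two statements in upper case.
echo "$upper"
```

Both forms feed the same bytes to tr; the redirection form simply lets the shell open the file itself.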

File Descriptors (FD)

In Linux/Unix, everything is a file. Regular files, directories, and even devices are files. Every open file has an associated number called a File Descriptor (FD).

Your screen also has a File Descriptor. When a program is executed, the output is sent to the File Descriptor of the screen, and you see the program output on your monitor. If the output were sent to the File Descriptor of the printer, the program output would be printed.

Whenever you execute a program/command at the terminal, 3 files are always open, viz., standard input, standard output, standard error.

File | File Descriptor
Standard Input (STDIN) | 0
Standard Output (STDOUT) | 1
Standard Error (STDERR) | 2
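The descriptor numbers in the table can be seen at work in a short experiment. This is a sketch, assuming a POSIX shell; the function emit and the log file names are hypothetical helpers made up for the illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical helper: writes one line to stdout (FD 1) and one to stderr (FD 2).
emit() {
    echo "normal output"
    echo "error output" >&2
}

# Send each stream to its own file: ">" acts on FD 1, "2>" acts on FD 2.
emit > out.log 2> err.log

# Send both streams to one file: stdout goes to the file first,
# then 2>&1 points stderr at the same destination.
emit > both.log 2>&1

cat out.log    # prints: normal output
cat err.log    # prints: error output
```

The order of the redirections matters: written as emit 2>&1 > both.log, stderr would be pointed at the terminal (where stdout was at that moment) and only stdout would reach the file.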

The following is a list of common commands which you can use for redirection:

Sr.No. | Command | Description
1 | cmd > file | Output of cmd is redirected to file
2 | cmd < file | Program cmd reads its input from file
3 | cmd >> file | Output of cmd is appended to file
4 | n > file | Output from stream with descriptor n is redirected to file
5 | n >> file | Output from stream with descriptor n is appended to file
6 | n >& m | Merges output from stream n with stream m
7 | n <& m | Merges input from stream n with stream m
8 | << tag | Standard input comes from the following lines, up to the next tag at the start of a line
9 | | | Takes output from one program, or process, and sends it to another
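A few of these operators can be combined in one short session. The sketch below assumes a POSIX shell; the file name notes.txt and the tag END are arbitrary choices for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# ">" creates or truncates the file; ">>" appends to it.
echo "first line" > notes.txt
echo "second line" >> notes.txt

# "<< END" (a here-document) feeds the lines up to the closing END to
# cat's standard input; ">>" then appends what cat prints to the file.
cat >> notes.txt << END
third line
END

# "|" connects the stdout of cat to the stdin of grep, which counts matches.
match_count=$(cat notes.txt | grep -c "line")
echo "$match_count"    # prints: 3
```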

]]>
Snowflake Interview Questions https://www.iexpertify.com/data-warehousing/snowflake-interview-questions/ Sat, 03 Jul 2021 16:44:02 +0000 https://www.iexpertify.com/?p=4778 Reading Time: 6 minutes Q) What is a Snowflake cloud data warehouse? Ans. Snowflake is an analytic data warehouse implemented as a SaaS service. It is built on a new SQL database engine with a unique architecture built for the cloud. This cloud-based data warehouse solution was first available on AWS as software to load and analyze massive volumes… Continue reading Snowflake Interview Questions

The post Snowflake Interview Questions appeared first on iExpertify.

]]>

Q) What is a Snowflake cloud data warehouse?

Ans. Snowflake is an analytic data warehouse implemented as a SaaS service. It is built on a new SQL database engine with a unique architecture built for the cloud. This cloud-based data warehouse solution was first available on AWS as software to load and analyze massive volumes of data. The most remarkable feature of Snowflake is its ability to spin up any number of virtual warehouses, which means the user can operate many independent workloads against the same data without any risk of contention.

Q) What is Unique about Snowflake Cloud Data Warehouse?

Ans. Snowflake is cloud native (built for the cloud), so it takes advantage of all the good things about the cloud and brings exciting new features like:

  • Auto scaling
  • Zero copy cloning
  • Dedicated virtual warehouses
  • Time travel
  • Military grade encryption and security
  • Robust data protection features

Snowflake is beautifully crafted, with smart defaults:

  • All the data is compressed by default
  • All the data is encrypted
  • It is columnar, which makes column-level analytical operations a lot faster

Not to mention the number of innovations in the product, e.g. the Intelligent Services layer, data shares, and tasks and streams. Snowflake also has simple and transparent pricing, which makes it much easier even for smaller businesses to afford a cloud data warehouse.


Q) How is data stored in Snowflake?

Ans. Snowflake stores data in multiple micro-partitions, which are internally optimized and compressed. The data is stored in a columnar format in the cloud storage of Snowflake. The data objects stored by Snowflake are not directly accessible or visible to users; you access them by running SQL query operations in Snowflake.

Q) What type of database is Snowflake?

Ans. Snowflake is built entirely on a SQL database. It’s a columnar-stored relational database that works well with Excel, Tableau, and many other tools. Snowflake includes its own query tool and supports multi-statement transactions, role-based security, and other features expected of a SQL database.

Q) What is a Columnar database and what are its benefits?

Ans. Columnar databases organize data at the column level instead of the conventional row level. Column-level operations are much faster and consume fewer resources than in a row-oriented relational database.

Q) What are the different ways to access the Snowflake Cloud Datawarehouse?

Ans. You can access the Snowflake Data Warehouse using

  • Web User Interface
  • ODBC Drivers
  • JDBC Drivers
  • SnowSQL Command line Client
  • Python Libraries

Q) Can AWS Glue connect to Snowflake?

Ans. Definitely. AWS Glue provides a comprehensive managed environment that connects easily with Snowflake as a data warehouse service. Together, these two solutions enable you to handle data ingestion and transformation with more ease and flexibility.

Q) Explain Snowflake editions.

Ans. Snowflake offers multiple editions depending on your usage requirements.

  • Standard edition – The introductory-level offering, which provides unlimited access to Snowflake’s standard features.
  • Enterprise edition – Along with Standard edition features and services, it offers additional features required by large-scale enterprises.
  • Business Critical edition – Also called Enterprise for Sensitive Data (ESD). It offers higher levels of data protection for organizations with sensitive data.
  • Virtual Private Snowflake (VPS) – Provides high-level security for organizations dealing with financial activities.

Q) What is Snowflake Caching?

Ans. Imagine executing a query that takes 10 minutes to complete. If you re-run the same query later in the day while the underlying data hasn’t changed, you are essentially doing the same work again and wasting resources.

Instead, Snowflake caches the results of the queries you run. When a new query is submitted, it checks previously executed queries, and if a matching query exists and the results are still cached, it uses the cached result set instead of executing the query. This can greatly reduce query times because Snowflake retrieves the result directly from the cache.

Q) Define the Snowflake Cluster.

Ans. In Snowflake, data partitioning is called clustering, and it is controlled by specifying cluster keys on a table. The process of managing clustered data in a table is called re-clustering.

Q) Explain Snowflake architecture.

Ans. Snowflake is built on a patented, multi-cluster, shared data architecture created for the cloud. The Snowflake architecture comprises storage, compute, and services layers that are logically integrated but scale independently of one another. Snowflake runs on AWS, Azure, or GCP cloud infrastructure and is a true SaaS offering: there is no software or hardware to install, and no ongoing maintenance or tuning is needed to work with Snowflake.

Three main layers make up the Snowflake architecture – database storage, query processing, and cloud services.

  • Data storage – In Snowflake, the stored data is reorganized into its internal optimized, compressed, columnar format.
  • Query processing – Virtual warehouses process the queries in Snowflake.
  • Cloud services – This layer coordinates and handles all activities across Snowflake. Its services include authentication, metadata management, infrastructure management, access control, and query parsing.

Q) What are the features of Snowflake? 

Ans. Unique features of the Snowflake data warehouse are listed below:

  • Database and Object Cloning
  • Support for XML
  • External tables
  • Hive metastore integration
  • Supports geospatial data
  • Security and data protection
  • Data sharing
  • Search optimization service
  • Table streams on external tables and shared tables
  • Result Caching

Q) What is Time Travel in Snowflake?

Ans. Time Travel is a feature which lets you access data as it existed at an earlier point in time. For example, if you have an Employee table and you delete the table accidentally, you can use Time Travel to go back 5 minutes and retrieve the data. Snowflake Time Travel enables accessing historical data (i.e., data that has been changed or deleted) at any point within a defined period. It serves as a powerful tool for performing the following tasks:

  • Query data in the past that has since been updated or deleted
  • Create clones of entire tables, schemas, and databases at or before specific points in the past
  • Restore tables, schemas, and databases that have been dropped

Q) Tell me something about Snowflake AWS?

Ans. For managing today’s data analytics, companies rely on a data platform which offers rapid deployment, compelling performance, and on-demand scalability. Snowflake on the AWS platform serves as a SQL data warehouse, which makes modern data warehousing effective, manageable, and accessible to all data users. It enables the data-driven enterprise with secure data sharing, elasticity and per-second pricing.

Q) Describe Snowflake computing. 

Ans. Snowflake cloud data warehouse platform provides instant, secure, and governed access to the entire data network and a core architecture to enable various types of data workloads, including a single platform for developing modern data applications.  

Q) What is the schema in Snowflake?

Ans. Schemas and databases are used for organizing data stored in Snowflake. A schema is a logical grouping of database objects such as tables and views. The benefits of using schemas in Snowflake are that they provide structured data and use little disk space.

Q) What kind of SQL does Snowflake use?

Ans. Snowflake supports the most common standardized version of SQL, i.e., ANSI SQL, for powerful relational database querying.

Q) What are the cloud platforms currently supported by Snowflake?

Ans. 

  • Amazon Web Services (AWS)
  • Google Cloud Platform (GCP)
  • Microsoft Azure (Azure)

Q) What ETL tools do you use with Snowflake?

Ans. The following ETL tools are commonly used with Snowflake:

  • Matillion
  • Blendo
  • Hevo Data
  • StreamSets
  • Etleap
  • Apache Airflow 

]]>
MOLAP, ROLAP and HOLAP https://www.iexpertify.com/data-warehousing/what-is-molap-architecture-advantages-example-tools/ Fri, 14 May 2021 03:56:08 +0000 https://www.iexpertify.com/?p=2872 Reading Time: 4 minutes

What is MOLAP?

Multidimensional OLAP (MOLAP) is a classical OLAP that facilitates data analysis by using a multidimensional data cube. Data is pre-computed, pre-summarized, and stored in a MOLAP (a major difference from ROLAP). Using a MOLAP, a user can view multidimensional data with different facets.

Multidimensional data analysis is also possible if a relational database is used, but that would require querying data from multiple tables. In contrast, MOLAP has all possible combinations of data already stored in a multidimensional array and can access this data directly. Hence, MOLAP is faster than Relational Online Analytical Processing (ROLAP).

In this tutorial, you will learn:

  • What is MOLAP?
  • MOLAP Architecture
  • Implementation considerations in MOLAP
  • MOLAP Advantages
  • MOLAP Disadvantages
  • MOLAP Tools

Key Points

  • In MOLAP, operations are called processing.
  • MOLAP tools process information with the same amount of response time irrespective of the level of summarizing.
  • MOLAP tools remove the complexities of designing a relational database to store data for analysis.
  • A MOLAP server implements two levels of storage representation to manage dense and sparse data sets.
  • The storage utilization can be low if the data set is sparse.
  • Facts are stored in a multidimensional array, and dimensions are used to query them.

MOLAP Architecture

MOLAP architecture includes the following components:

  • Database server
  • MOLAP server
  • Front-end tool

In this architecture:

  1. The user requests reports through the interface.
  2. The application logic layer of the MDDB retrieves the stored data from the database.
  3. The application logic layer forwards the result to the client/user.

MOLAP architecture mainly reads the precompiled data. It has limited capabilities to dynamically create aggregations or to calculate results that have not been pre-calculated and stored.

For example, an accounting head can run a report showing the corporate P/L account or the P/L account for a specific subsidiary. The MDDB would retrieve precompiled profit and loss figures and display that result to the user.

Implementation considerations in MOLAP

  • In MOLAP it's essential to consider both maintenance and storage implications when creating a strategy for building cubes.
  • Proprietary languages, for example MDX by Microsoft, are used to query MOLAP. However, MOLAP tools also offer extensive click-and-drag support.
  • It is difficult to scale because the number and size of cubes required grow as dimensions increase.
  • APIs should be provided for probing the cubes.
  • The data structure must support multiple subject areas of data analysis through which data can be navigated and analyzed. When the navigation changes, the data structure needs to be physically reorganized.
  • Database administrators need a different skill set and tools to build and maintain the database.

MOLAP Advantages

  • MOLAP can manage, analyze and store considerable amounts of multidimensional data.
  • Fast query performance due to optimized storage, indexing, and caching.
  • Smaller sizes of data as compared to the relational database.
  • Automated computation of higher-level aggregates of the data.
  • Helps users to analyze larger, less-defined data.
  • MOLAP is easier for the user, which is why it is a suitable model for inexperienced users.
  • MOLAP cubes are built for fast data retrieval and are optimal for slicing and dicing operations.
  • All calculations are pre-generated when the cube is created.

MOLAP Disadvantages

  • One major weakness of MOLAP is that it is less scalable than ROLAP, as it handles only a limited amount of data.
  • MOLAP also introduces data redundancy, which makes it resource intensive.
  • Processing in MOLAP solutions may be lengthy, particularly on large data volumes.
  • MOLAP products may face issues while updating and querying models when there are more than ten dimensions.
  • MOLAP is not capable of containing detailed data.
  • The storage utilization can be low if the data set is highly scattered.
  • It can handle only a limited amount of data; therefore, it's impossible to include a large amount of data in the cube itself.

MOLAP Tools

  • Essbase - Tools from Oracle that have a multidimensional database.
  • Express Server - Web-based environment that runs on Oracle database.
  • Yellowfin - Business analytics tools for creating reports and dashboards.
  • Clear Analytics - Clear Analytics is an Excel-based business solution.
  • SAP Business Intelligence - Business analytics solutions from SAP.

Summary:

  • Multidimensional OLAP (MOLAP) is a classical OLAP that facilitates data analysis by using a multidimensional data cube.
  • MOLAP tools process information with the same amount of response time irrespective of the level of summarizing.
  • A MOLAP server implements two levels of storage to manage dense and sparse data sets.
  • MOLAP can manage, analyze, and store considerable amounts of multidimensional data.
  • It helps to automate the computation of higher-level aggregates of the data.
  • It is less scalable than ROLAP, as it handles only a limited amount of data.

The post MOLAP, ROLAP and HOLAP appeared first on iExpertify.

]]>

ROLAP

This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP’s slicing and dicing functionality. In essence, each action of slicing and dicing is equivalent to adding a “WHERE” clause to the SQL statement.

Advantages of ROLAP Cube

  • High data efficiency. It offers high data efficiency because query performance and the access language are optimized particularly for multidimensional data analysis.
  • Scalability. This type of OLAP system offers scalability for managing large volumes of data, even when the data is steadily increasing.

Disadvantages of ROLAP Cube

  • Demand for higher resources. ROLAP needs high utilization of manpower, software, and hardware resources.
  • Aggregate data limitations. ROLAP tools use SQL for all calculations of aggregate data. However, there are no set limits for handling computations.
  • Slow query performance. Query performance in this model is slow when compared with MOLAP.

Hybrid OLAP

HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP. For summary-type information, HOLAP leverages cube technology for faster performance. When detail information is needed, HOLAP can “drill through” from the cube into the underlying relational data.

  1. Aggregated or computed data is stored in a multidimensional OLAP cube
  2. Detailed information is stored in a relational database.

MOLAP

MOLAP uses array-based multidimensional storage engines to display multidimensional views of data. Basically, they use an OLAP cube. Data is pre-computed, pre-summarized, and stored in a MOLAP (a major difference from ROLAP).

Using a MOLAP, a user can view multidimensional data with different facets. Multidimensional data analysis is also possible if a relational database is used, but that would require querying data from multiple tables. In contrast, MOLAP has all possible combinations of data already stored in a multidimensional array and can access this data directly. Hence, MOLAP is faster than Relational Online Analytical Processing (ROLAP).

MOLAP Tools

  1. Essbase – Tools from Oracle that has a multidimensional database.
  2. Express Server – Web-based environment that runs on Oracle database.
  3. Yellowfin – Business analytics tools for creating reports and dashboards.
  4. Clear Analytics – Clear analytics is an Excel-based business solution.
  5. SAP Business Intelligence – Business analytics solutions from SAP

Implementation considerations in MOLAP

  • In MOLAP it’s essential to consider both maintenance and storage implications when creating a strategy for building cubes.
  • Proprietary languages, for example MDX by Microsoft, are used to query MOLAP. However, MOLAP tools also offer extensive click-and-drag support.
  • It is difficult to scale because the number and size of cubes required grow as dimensions increase.
  • APIs should be provided for probing the cubes.
  • The data structure must support multiple subject areas of data analysis through which data can be navigated and analyzed. When the navigation changes, the data structure needs to be physically reorganized.
  • Database administrators need a different skill set and tools to build and maintain the database.

MOLAP Advantages

  • MOLAP can manage, analyze and store considerable amounts of multidimensional data.
  • Fast Query Performance due to optimized storage, indexing, and caching.
  • Smaller sizes of data as compared to the relational database.
  • Automated computation of higher level of aggregates data.
  • Helps users to analyze larger, less-defined data.
  • MOLAP is easier for the user, which is why it is a suitable model for inexperienced users.
  • MOLAP cubes are built for fast data retrieval and are optimal for slicing and dicing operations.
  • All calculations are pre-generated when the cube is created.

MOLAP Disadvantages

  • One major weakness of MOLAP is that it is less scalable than ROLAP, as it handles only a limited amount of data.
  • MOLAP also introduces data redundancy, which makes it resource intensive.
  • Processing in MOLAP solutions may be lengthy, particularly on large data volumes.
  • MOLAP products may face issues while updating and querying models when there are more than ten dimensions.
  • MOLAP is not capable of containing detailed data.
  • The storage utilization can be low if the data set is highly scattered.
  • It can handle only a limited amount of data; therefore, it’s impossible to include a large amount of data in the cube itself.

]]>
Difference between Database and Data Warehouse https://www.iexpertify.com/data-warehousing/database-vs-data-warehouse-key-differences/ Tue, 11 May 2021 21:29:06 +0000 https://www.iexpertify.com/?p=2866 Reading Time: 8 minutes

What is Database?

A database is a collection of related data which represents some elements of the real world. It is designed to be built and populated with data for a specific task. It is also a building block of your data solution.

In this tutorial, you will learn:

  • What is Database?
  • What is a Data Warehouse?
  • Why use a Database?
  • Why Use Data Warehouse?
  • Characteristics of Database
  • Characteristics of Data Warehouse
  • Difference between Database and Data Warehouse
  • Applications of Database
  • Applications of Data Warehousing
  • Disadvantages of Database
  • Disadvantages of Data Warehouse

What is a Data Warehouse?

A data warehouse is an information system which stores historical and cumulative data from single or multiple sources. It is designed to analyze, report on, and integrate transaction data from different sources.

A data warehouse eases the analysis and reporting process of an organization. It is also a single version of truth for the organization's decision-making and forecasting processes.

Why use a Database?

Here are the prime reasons for using a database system:

  • It offers security of data and of access to it.
  • A database offers a variety of techniques to store and retrieve data.
  • A database acts as an efficient handler to balance the requirements of multiple applications using the same data.
  • A DBMS offers integrity constraints to provide a high level of protection and prevent access to prohibited data.
  • A database manages concurrent access so that multiple users can work with the same data without conflicting with one another.

Why Use Data Warehouse?

Here are important reasons for using a data warehouse:

  • A data warehouse helps business users to access critical data from several sources all in one place.
  • It provides consistent information on various cross-functional activities.
  • It helps you to integrate many sources of data to reduce stress on the production system.
  • A data warehouse helps you to reduce TAT (total turnaround time) for analysis and reporting.
  • A data warehouse helps users to access critical data from different sources in a single place, so it saves the user's time in retrieving information from multiple sources. You can also access data from the cloud easily.
  • A data warehouse allows you to store a large amount of historical data to analyze different periods and trends and make future predictions.
  • It enhances the value of operational business applications and customer relationship management systems.
  • It separates analytics processing from transactional databases, improving the performance of both systems.
  • Stakeholders and users may be overestimating the quality of data in the source systems. A data warehouse provides more accurate reports.

Characteristics of Database

  • Offers security and removes redundancy
  • Allows multiple views of the data
  • A database system follows ACID compliance (Atomicity, Consistency, Isolation, and Durability)
  • Allows insulation between programs and data
  • Sharing of data and multiuser transaction processing
  • Relational databases support a multi-user environment

Characteristics of Data Warehouse

  • A data warehouse is subject oriented, as it offers information related to a theme instead of the company's ongoing operations.
  • The data also needs to be stored in the data warehouse in a common and universally acceptable manner.
  • The time horizon for the data warehouse is relatively extensive compared with other operational systems.
  • A data warehouse is non-volatile, which means the previous data is not erased when new information is entered in it.

Difference between Database and Data Warehouse

Parameter | Database | Data Warehouse
Purpose | Is designed to record | Is designed to analyze
Processing Method | Uses Online Transactional Processing (OLTP) | Uses Online Analytical Processing (OLAP)
Usage | Helps to perform fundamental operations for your business | Allows you to analyze your business
Tables and Joins | Tables and joins are complex because they are normalized | Tables and joins are simple because they are denormalized
Orientation | An application-oriented collection of data | A subject-oriented collection of data
Storage limit | Generally limited to a single application | Stores data from any number of applications
Availability | Data is available in real time | Data is refreshed from source systems as and when needed
Modeling | ER modeling techniques are used for designing | Data modeling techniques are used for designing
Technique | Capture data | Analyze data
Data Type | Data stored in the database is up to date | Current and historical data is stored; it may not be up to date
Storage of data | A flat relational approach is used for data storage | Uses dimensional and normalized approaches for the data structure, for example star and snowflake schemas
Query Type | Simple transaction queries are used | Complex queries are used for analysis
Data Summary | Detailed data is stored | Highly summarized data is stored

Applications of Database

Sector | Usage
Banking | Used in the banking sector for customer information, account-related activities, payments, deposits, loans, credit cards, etc.
Airlines | Used for reservations and schedule information.
Universities | Used to store student information, course registrations, colleges, and results.
Telecommunication | Helps to store call records, monthly bills, balance maintenance, etc.
Finance | Helps you to store information related to stocks, sales, and purchases of stocks and bonds.
Sales & Production | Used for storing customer, product, and sales details.
Manufacturing | Used for the data management of the supply chain and for tracking production of items and inventory status.
HR Management | Stores details about employees' salaries, deductions, generation of paychecks, etc.

Applications of Data Warehousing

Sector | Usage
Airline | Used for airline system management operations like crew assignment, route analysis, frequent flyer program discount schemes for passengers, etc.
Banking | Used in the banking sector to manage the resources available on the desk effectively.
Healthcare sector | Used to strategize and predict outcomes, create patients' treatment reports, etc. Data warehouse systems enabled with advanced machine learning and big data can predict ailments.
Insurance sector | Widely used to analyze data patterns and customer trends, and to track market movements quickly.
Retail chain | Helps you to track items, identify the buying patterns of customers and promotions, and determine pricing policy.
Telecommunication | Used for product promotions, sales decisions, and distribution decisions.

Disadvantages of Database

  • The cost of hardware and software for implementing a database system is high, which can increase the budget of your organization.
  • Many DBMS systems are complex, so users need training to use the DBMS.
  • A DBMS can't perform sophisticated calculations.
  • Issues regarding compatibility with systems which are already in place.
  • Data owners may lose control over their data, raising security, ownership, and privacy issues.

Disadvantages of Data Warehouse

  • Adding new data sources takes time and is associated with high cost.
  • Sometimes problems associated with the data warehouse may go undetected for many years.
  • Data warehouses are high-maintenance systems. Extracting, loading, and cleaning data can be time-consuming.
  • The data warehouse may look simple, but it is actually too complicated for average users. You need to provide training to end users, who may otherwise end up not using the data mining tools and the warehouse.
  • Despite best efforts at project management, the scope of data warehousing will always increase.

What Works Best for You?

To sum up, we can say that the database helps to perform the fundamental operations of a business, while the data warehouse helps you to analyze your business. You choose either one of them based on your business goals.

The post Difference between Database and Data Warehouse appeared first on iExpertify.

]]>
Reading Time: 8 minutes

Databases and data warehouses are both systems that store data, but they serve very different purposes.

What is a Data Warehouse?

A data warehouse is an information system that stores historical and cumulative data from single or multiple sources. Data flows into a data warehouse from transactional systems, relational databases, and other sources, typically on a regular cadence. Business analysts, data engineers, data scientists, and decision makers rely on reports, dashboards, and analytics tools (business intelligence (BI) tools, SQL clients) to extract insights from their data, monitor business performance, and support decision making. A data warehouse is designed to analyze, report on, and integrate transaction data from different sources.

A data warehouse eases the analysis and reporting process of an organization. It also serves as a single version of truth for the organization's decision making and forecasting.

What is a Database?

 

A database is a computerized system that makes it easy to search, select, and store information. Databases are used in many different places. A database (DB), in the most general sense, is an organized collection of data. The most common examples are relational databases such as Oracle, SQL Server, and MySQL.

How do data warehouses and databases co-exist?

Typically, businesses use a combination of a database, a data lake, and a data warehouse to store and analyze data. As the volume and variety of data increases, it’s advantageous to follow one or more common patterns for working with data across your database, data lake, and data warehouse. A data warehouse is specially designed for data analytics, which involves reading large amounts of data to understand relationships and trends across the data. A database is used to capture and store data, such as recording details of a transaction.

Why use a Database?

Here are the prime reasons for using a database system:

  • It offers security for data and for access to it.
  • A database offers a variety of techniques to store and retrieve data.
  • A database acts as an efficient handler to balance the requirements of multiple applications using the same data.
  • A DBMS offers integrity constraints and a high level of protection to prevent access to prohibited data.
  • A database controls concurrent access, so that several users working with the same data at the same time do not make conflicting changes.

Why Use Data Warehouse?

Here are important reasons for using a data warehouse:

  • A data warehouse helps business users access critical data from many sources, all in one place.
  • It provides consistent information on various cross-functional activities.
  • It helps you integrate many sources of data to reduce stress on the production system.
  • A data warehouse helps you reduce TAT (total turnaround time) for analysis and reporting.
  • Because critical data from different sources is available in a single place, users save the time of retrieving data from multiple sources. You can also access data from the cloud easily.
  • A data warehouse allows you to store a large amount of historical data to analyze different periods and trends and make future predictions.
  • It enhances the value of operational business applications and customer relationship management systems.
  • It separates analytics processing from transactional databases, improving the performance of both systems.
  • Stakeholders and users may be overestimating the quality of data in the source systems. A data warehouse provides more accurate reports.

Processing Types: OLAP vs OLTP

The most significant difference between databases and data warehouses is how they process data. While most databases are online transaction processing (OLTP) systems, most data warehouses are online analytical processing (OLAP) systems. OLAP gets its information by gathering data from OLTP systems and other databases. Because of how OLAP data is structured, it’s far easier to run queries and analyses on it, and anyone can query the data warehouse with either data warehouse software or knowledge of SQL. Individual subsections of the data warehouse, which are typically relevant to an individual team or department, are called “data marts.”

The data in databases are normalized. The goal of normalization is to reduce and even eliminate data redundancy, i.e., storing the same piece of data more than once. This reduction of duplicate data leads to increased consistency and, thus, more accurate data as the database stores it in only one place.
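The effect of normalization described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the `customers` and `orders` tables and their sample rows are assumptions made up for the example, not taken from any particular system.

```python
import sqlite3

# Hypothetical normalized schema: each customer's city is stored exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Asha', 'Pune');
    INSERT INTO orders VALUES (101, 1, 25.0), (102, 1, 40.0);
""")

# Because the city lives in one place, a single UPDATE corrects it for every order.
conn.execute("UPDATE customers SET city = 'Mumbai' WHERE customer_id = 1")

# Reading the data back requires a join, which is the price of normalization.
rows = conn.execute("""
    SELECT o.order_id, c.name, c.city
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

In a denormalized layout the city would be copied onto every order row, so the same correction would have to touch each copy, but reads would no longer need the join.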

denormalization
normalization

Difference between Database and Data Warehouse

Databases aren’t better than data warehouses, or vice versa. They perform very different functions from one another, and each is very powerful.

Characteristics | Data Warehouse | Transactional Database
Purpose | To store large datasets and historical information for analysis | To store operational data that is current and up to date
Suitable workloads | Analytics, reporting, big data (OLAP) | Transaction processing (OLTP)
Data source | Data collected and normalized from many sources | Data captured as-is from a single source, such as a transactional system
Data capture | Bulk write operations, typically on a predetermined batch schedule | Optimized for continuous write operations as new data is available, to maximize transaction throughput
Tables and Joins | Denormalized schemas, such as the star schema or snowflake schema. Joins are simple. | Highly normalized, static schemas. Joins are complex.
Data storage | Optimized for simplicity of access and high-speed query performance using columnar storage | Optimized for high-throughput write operations to a single row-oriented physical block
Data access | Optimized to minimize I/O and maximize data throughput | High volumes of small read operations
Data Modelling | Data modeling techniques are used for designing. | ER modeling techniques are used for designing.

 

Applications of Database

Sector | Usage
Banking | Used in the banking sector for customer information, account-related activities, payments, deposits, loans, credit cards, etc.
Airlines | Used for reservations and schedule information.
Universities | Used to store student information, course registrations, colleges, and results.
Telecommunication | Helps to store call records, monthly bills, balance maintenance, etc.
Finance | Helps you store information related to stocks, sales, and purchases of stocks and bonds.
Sales & Production | Used for storing customer, product, and sales details.
Manufacturing | Used for data management of the supply chain and for tracking production of items and inventory status.
HR Management | Details about employees’ salaries, deductions, generation of paychecks, etc.

Database Use Cases

Databases process the day-to-day transactions in an organization. Some examples of database applications include:

  • An ecommerce website creating an order for a product it has sold
  • An airline using an online booking system
  • A hospital registering a patient
  • A bank adding an ATM withdrawal transaction to an account
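The ATM withdrawal case above can be sketched as a single small transaction, which is the typical OLTP pattern. The example uses Python's built-in sqlite3 module; the `accounts` and `transactions` tables and the `withdraw` helper are hypothetical names chosen for illustration, and a real banking system would be considerably more involved.

```python
import sqlite3

def withdraw(conn, account_id, amount):
    """Record an ATM withdrawal atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE account_id = ? AND balance >= ?",
                (amount, account_id, amount),
            )
            if cur.rowcount != 1:
                raise ValueError("insufficient funds or unknown account")
            conn.execute(
                "INSERT INTO transactions (account_id, kind, amount) "
                "VALUES (?, 'ATM_WITHDRAWAL', ?)",
                (account_id, amount),
            )
        return True
    except ValueError:
        return False

# Hypothetical tables and starting balance, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, balance REAL NOT NULL)")
conn.execute("CREATE TABLE transactions (account_id INTEGER, kind TEXT, amount REAL)")
conn.execute("INSERT INTO accounts VALUES (1, 100.0)")
conn.commit()

ok = withdraw(conn, 1, 30.0)         # enough money: both writes are committed
rejected = withdraw(conn, 1, 500.0)  # not enough money: nothing is recorded
balance = conn.execute("SELECT balance FROM accounts WHERE account_id = 1").fetchone()[0]
logged = conn.execute("SELECT COUNT(*) FROM transactions").fetchone()[0]
print(ok, rejected, balance, logged)
```

The balance update and the transaction log entry either both happen or neither does, which is the kind of guarantee a transactional database is built to provide.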

Applications of Data Warehousing

Sector | Usage
Airline | Used for airline system management operations like crew assignment, route analysis, frequent flyer program discount schemes for passengers, etc.
Banking | Used in the banking sector to manage the resources available on the desk effectively.
Healthcare sector | Data warehouses are used to strategize and predict outcomes, create patients’ treatment reports, etc. With advanced machine learning and big data, data warehouse systems can help predict ailments.
Insurance sector | Data warehouses are widely used to analyze data patterns and customer trends and to track market movements quickly.
Retail chain | Helps you track items, identify the buying patterns of customers, and plan promotions; also used for determining pricing policy.
Telecommunication | In this sector, data warehouses are used for product promotions, sales decisions, and distribution decisions.

Data Warehouse Use Cases

Data warehouses provide high-level reporting and analysis that empower businesses to make more informed business decisions. Use cases include:

  • Segmenting customers into different groups based on their past purchases to provide them with more tailored content
  • Predicting customer churn using the last ten years of sales data
  • Creating demand and sales forecasts to decide which areas to focus on next quarter
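The customer segmentation case above might look like the following analytical query. It is a rough sketch that uses Python's built-in sqlite3 module as a stand-in for a real warehouse engine; the `sales_fact` table, its rows, and the spending thresholds are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical fact table holding several years of historical sales.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_fact (customer TEXT, sale_year INTEGER, amount REAL);
    INSERT INTO sales_fact VALUES
        ('alice', 2020, 900.0), ('alice', 2021, 700.0),
        ('bob',   2020,  50.0), ('bob',   2021,  30.0),
        ('carol', 2021, 400.0);
""")

# One read-heavy query scans the whole history and aggregates per customer.
segments = conn.execute("""
    SELECT customer,
           SUM(amount) AS total_spent,
           CASE
               WHEN SUM(amount) >= 1000 THEN 'high'
               WHEN SUM(amount) >= 300  THEN 'medium'
               ELSE 'low'
           END AS segment
    FROM sales_fact
    GROUP BY customer
    ORDER BY total_spent DESC
""").fetchall()
print(segments)
```

Unlike a transactional workload, nothing is written here: the query reads many rows at once and summarizes them, which is the access pattern a data warehouse is optimized for.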

Summary

Now you understand the difference between a database and a data warehouse and when to use which one. Your business needs both an effective database and data warehouse solution to truly succeed in today’s economy.


]]>
Linux vs Unix https://www.iexpertify.com/linux/linux-vs-unix/ Wed, 28 Apr 2021 13:03:00 +0000 https://www.iexpertify.com/?p=4807 Reading Time: 8 minutes An operating system is the most important software that a computer cannot work without. What is Unix? Unix is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in the 1970s at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and others. There are different versions of Unix, and some of the simplified versions are –… Continue reading Linux vs Unix

The post Linux vs Unix appeared first on iExpertify.

]]>
Reading Time: 8 minutes

An operating system is the most important software on a computer; without it, the computer cannot work.

What is Unix?

Unix is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and others. There are different versions of Unix; well-known ones include Sun Solaris and macOS.

It began as a one-man project under the leadership of Ken Thompson of Bell Labs and went on to become one of the most widely used operating systems. Most commercial versions of Unix are proprietary.

Unix traditionally works through a CLI (Command Line Interface), but GUIs have also been developed for Unix systems. Unix is popular in companies, universities, big enterprises, etc.

What are the features of the Unix operating system?

Some of the prominent features of Unix are as follows:

  • Unix distinguishes itself from its predecessors as the first portable operating system: almost the entire operating system is written in the C programming language, which allows Unix to operate on numerous platforms.
  • It can be used as the master control program in workstations and servers.
  • Hundreds of commercial applications are available
  • In its heyday, UNIX was rapidly adopted and became the standard OS in universities.
  • It was the first operating system that was written in a high-level language and was easy to port to another machine with the least adaptation.
  • It is a multi-user system where the same resources can be shared by multiple users.
  • The functionality of Unix gets extended through a standard programming interface.
  • Unix offers multi-tasking, which means each user can carry out several processes simultaneously.

Unix Architecture

Unix architecture consists of several layers mentioned below:

  • Hardware – It consists of all the hardware related information.
  • Kernel – It is the heart of the operating system that interacts with hardware. It also handles tasks like memory management, task management, file management, power management, etc.
  • Shell – This layer processes your requests. When a user types commands at the terminal, the shell interprets the commands and calls the program the user needs. Commonly used commands include mv, cat, grep, id, wc, cp, nroff, etc.
  • Application Layer – The application layer includes graphics programs, database management programs, word processors, commands, etc. These programs, as a single unit provide an application to end-users.
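The shell's role in the layers above (interpret a typed command, then call the program the user needs) can be sketched in a few lines of Python. This is a simplified illustration, not how a real shell is implemented: `run_command` is a hypothetical helper, and the example assumes a Unix-like system where `echo` is available as a program. A real shell also handles pipes, redirection, variables, and built-in commands.

```python
import shlex
import subprocess

def run_command(line):
    """Split a command line into words, run the program, and return its output."""
    argv = shlex.split(line)  # e.g. "wc -l notes.txt" -> ["wc", "-l", "notes.txt"]
    if not argv:
        return ""
    # The kernel is asked to start the program named by the first word.
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout

output = run_command("echo hello from the shell")
print(output, end="")
```

The program itself (here `echo`) belongs to the application layer; the shell only finds it and asks the kernel to run it.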

What is Linux?

Linux is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged in a Linux distribution.

The Linux OS relays instructions from an application to the computer’s processor and sends the results back to the application. It can be installed on many types of devices: computers, mobile phones, tablets, video game consoles, etc.

The development of Linux is one of the most prominent examples of free and open-source software collaboration. Today, many companies and individuals have released their own versions of an OS based on the Linux kernel.

What are the features of Linux?

  • Supports multitasking
  • It can easily co-exist with other operating systems.
  • It can run multiple user programs
  • Individual accounts are protected because of appropriate authorization
  • Linux is a Unix-like system but does not use Unix code.
  • Programs may consist of more than one process, and each process can have multiple threads.
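The last point in the list, that one process can contain several threads, can be observed with a small Python sketch. The `worker` function and the thread count are arbitrary choices for illustration.

```python
import os
import threading

seen_pids = []
lock = threading.Lock()

def worker():
    # Every thread reports the ID of the process it is running in.
    with lock:
        seen_pids.append(os.getpid())

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All four threads belong to the same process, so they report the same PID.
print(set(seen_pids) == {os.getpid()})
```

Starting a separate process instead (for example with the `multiprocessing` module) would give each worker its own process ID.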

Architecture of Linux

The Linux operating system consists of four layers – Hardware, Kernel, Shell, and Application layer. Let us find out more about these layers:

  • Hardware – The hardware layer consists of all the physical devices that are attached to the system. Some examples are RAM, motherboard, CPU, hard disk drive, etc.
  • Kernel – The kernel is the core layer of the Linux operating system and interacts directly with the hardware.
  • Shell – This layer acts as an interface that takes input from the user and sends it to the kernel, and vice versa.
  • Applications – These are the utility programs that run on the shell. Some of these applications are web browsers, media players, text editors, etc.

Differences – Linux vs Unix

Are Linux and Unix the same thing? No, they are not. Unix and Linux are different from each other, but they do share a relationship, because Linux is modeled on Unix: it continues the Unix design without using Unix code. Is Linux better than Unix? Read on to find out which is better suited to your working style.

Basis of Difference | Linux | Unix
Source Code | The source code is accessible to the general public. | The source code of most commercial versions is not available to the general public.
Kernel | Follows a monolithic kernel approach | Can be monolithic, microkernel, or hybrid
Portability | It is portable and can be booted from a USB stick. | Commercial versions are typically tied to specific vendor hardware.
Cost | Linux is freely distributed and can be downloaded in different ways. Paid versions are also available. | Different versions of Unix have different prices depending on the vendor.
Development | Linux is open-source; thousands of programmers collaborate online and contribute to its development. | The different versions of Unix were developed by AT&T and by commercial vendors.
Text Interface | BASH is the default shell on most Linux distributions, and other command interpreters are supported. | Originally made to work with the Bourne shell; it is now compatible with many other shells.
GUI | Common desktop environments are KDE and GNOME. | Common Desktop Environment (CDE) and GNOME
Threat detection | Threat detection is fast in Linux because it is community-driven. If any user posts about a threat, a team of developers starts resolving it. | Unix users have to wait a little longer for a bug to get fixed.
Architecture | It is available for more than twenty different CPU types, including ARM. | It is available for PA-RISC and Itanium machines.
Supported file systems | xfs, nfs, cramfs, ext 1 to 4, ufs, devpts, NTFS | zfs, hfs, gpfs, xfs, vxfs
Versions | Different versions of Linux are Red Hat, Ubuntu, openSUSE, etc. | Different versions of Unix are HP-UX, AIX, BSD, etc.

Linux vs Unix: Commands

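There are certain differences between the shell commands of Unix and Linux. The commands may look similar, but they are not identical: options and behavior can differ between implementations, for example between the GNU utilities common on Linux and the utilities shipped with a commercial Unix.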

Solaris vs Linux

Solaris, also called Oracle Solaris, belongs to the Unix family. Linux is compatible with more system architectures than Solaris and is therefore more portable. Solaris seems better when it comes to hardware integration and stability. Linux has a faster development rate than Solaris.

macOS vs Linux 

macOS is a Unix OS whose kernel is called XNU. It runs on Apple’s computers, which are among the most reliable PCs. macOS is relatively easy to install. Linux is cheaper and more flexible. macOS uses the APFS file system by default (older versions used HFS+), while Linux commonly uses ext4.

Limitations of Linux

  • There is no standard edition of Linux.
  • It has patchier support for drivers that may cause malfunctioning of the entire system.
  • Linux is not very easy to use at least for new users.
  • Many programs that we use on Windows, like Microsoft Office, will run on Linux only with the help of a complicated emulator or compatibility layer.
  • Linux may be suitable for a corporate user but much harder for a home setting.

Limitations of Unix

  • Unix is designed for a slow computer system; it can’t offer fast performance.
  • It has an unfriendly and non-mnemonic user interface
  • Shell interface can be risky because a single typing error can destroy files.
  • It lacks consistency because its versions are slightly different on various machines.
  • Unix does not give any assured hardware interrupt response time.

Is Linux better than Unix?

Linux has gained more popularity because it is free and more flexible than Unix. The two are not the same but are very similar, and even the commands vary between distributions. Linux is reported to be growing faster than other operating systems, and it may leave Unix installations far behind in the future.

Future of Linux and Unix

History of Linux – Linux was introduced in 1991 and became popular in a short period of time. Originally, it was designed only for the Intel 386, but today it runs on a wide range of hardware. It has millions of users and is doing well in embedded systems, industrial automation, cloud computing, mobile devices, robotics, etc. It is considerably less popular on desktop systems.

As for its future, Linus Torvalds, the creator of Linux, has announced developments and improvements intended to make it compatible with a wider range of systems.

Unix is an old OS, and its advocates are continuously developing new specifications to make it compatible with the coming era of computing.


]]>
Introduction to DataStage Director https://www.iexpertify.com/data-warehousing/introduction-to-datastage-director/ Thu, 22 Apr 2021 15:29:22 +0000 https://www.iexpertify.com/?p=12866 Reading Time: 3 minutes In the Datastage Director, we can: run, schedule and monitor jobs view job status, logs and schedules filter the displayed events Click on the datastage director icon to open the application: Fill out the server details, user credentials and choose the project name The DataStage Director window is divided into two panes: The Job Category… Continue reading Introduction to DataStage Director

The post Introduction to DataStage Director appeared first on iExpertify.

]]>
Reading Time: 3 minutes

In the DataStage Director, we can:

  • run, schedule and monitor jobs
  • view job status, logs and schedules
  • filter the displayed events

Click on the DataStage Director icon to open the application:

director icon

Fill out the server details and user credentials, and choose the project name.

The DataStage Director window is divided into two panes:

  • The Job Category pane lists all of the jobs in the repository.
  • The right pane shows one of three views: Status view, Schedule view, or Log view.

This table describes DataStage Director menu options:

Menu Option | Description
Project | Open another project, print, or exit.
View | Display or hide the toolbar, status bar, buttons, or job category pane, specify sorting order, change views, filter entries, show more details, or refresh the screen.
Search | Start a text search dialog box.
Job | Validate, run, schedule, stop, or reset a job, purge old entries from the job log file, delete unwanted jobs, clean up job resources (if this is enabled), set default job parameter values.
Tools | Monitor running jobs, manage job batches, start the DataStage Designer.
Help | Displays online help.

DataStage Director has three view options:

  • The Status view displays the status, date and time started, elapsed time, and other run information about each job in the selected repository category.
  • The Schedule view displays job scheduling details.
  • The Log view displays all of the events for a particular run of a job.

To check whether a job has completed, look at the Status column (Compiled, Aborted, Finished, etc.). The start and end times are also listed in the Director.

To see a summary of a particular job run, double-click the job; a window with the job parameters, status, etc. will pop up:

job summary datastage director

For debugging, we need to look at the detailed log. Click on the log icon.

The log contains info records, warnings, and fatal errors that help in debugging issues.


]]>