Informatics Practices – Class XI
2024-25

Chapter 5
Understanding Data

“Data is not information, Information is not knowledge, Knowledge is not
understanding, Understanding is not wisdom.”
— Gary Schubert

In this chapter
» Introduction to Data
» Data Collection
» Data Storage
» Data Processing
» Statistical Techniques for Data Processing

5.1 Introduction to Data
Many a time, people take decisions based on
certain data or information. For example, while
choosing a college for getting admission, one
looks at placement data of previous years of that
college, educational qualification and experience
of the faculty members, laboratory and hostel
facilities, fees, etc. So we can say that identification
of a college is based on various data and their
analysis. Governments systematically collect
and record data about the population through
a process called census. Census data contains
valuable information which is helpful in planning and
formulating policies. Likewise, the coaching staff of a
sports team analyses previous performances of opponent
teams for making strategies. Banks maintain data about
the customers, their account details and transactions.
All these examples highlight the need for data in various
fields. Data are indeed crucial for decision making.
In the previous examples, one cannot make decisions
merely by looking at the raw data. In our example of choosing
a college, suppose the placement cell of the college has
maintained data of about 2000 students placed with
different companies at different salary packages in the
last 3 years. Looking at such data, one cannot make
any remark about the placement of students of that
college. The college processes and analyses this data
and the results are given in the placement brochure of
the college through summarisation as well as visuals for
easy understanding. Hence, data need to be gathered,
processed and analysed for making decisions.
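As a minimal sketch of what such summarisation could look like, the Python snippet below condenses a handful of salary packages into a few summary figures. The values are invented for illustration (a real placement cell would hold around 2000 records), and the names `packages_lpa` and `summarise` are hypothetical.

```python
from statistics import mean, median

# Hypothetical salary packages (in lakh rupees per annum) of placed students.
# These values are invented for illustration only.
packages_lpa = [3.5, 4.0, 4.0, 5.5, 6.0, 7.5, 12.0]


def summarise(values):
    """Return a few summary figures for a list of numbers."""
    return {
        "count": len(values),
        "lowest": min(values),
        "highest": max(values),
        "average": round(mean(values), 2),
        "median": median(values),
    }


summary = summarise(packages_lpa)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A few such figures are far easier to read than the full list of records, which is the point of processing raw data before presenting it.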
In general, data is a collection of characters, numbers,
and other symbols that represent values of some
situations or variables. The word data is plural; its
singular is “datum”. Using computers, data are stored
in electronic form because data processing becomes
faster and easier as compared to manual data processing
done by people. The Information and Communication
Technology (ICT) revolution led by computers, mobile phones and
the Internet has resulted in the generation of large volumes of
data at a very fast pace. The following list contains
some examples of data that we often come across.
• Name, age, gender, contact details, etc., of a person
• Transaction data generated through banking,
ticketing, shopping, etc., whether online or offline
• Images, graphics, animations, audio, video
• Documents and web pages
• Online posts, comments and messages
• Signals generated by sensors
• Satellite data including meteorological data,
communication data, earth observation data, etc.
A knowledge base is a store of information consisting of facts,
assumptions and rules which an AI system can use for decision making.

5.1.1 Importance of Data
Human beings rely on data for making decisions.
Besides, a large amount of data, when processed with the
help of a computer, shows us the possibilities or hidden
traits which are otherwise not visible to humans. When
one withdraws money from an ATM, the bank needs to debit
the withdrawn amount from the linked account. So the
bank needs to maintain data and update it as and when
required. The meteorological offices continuously
monitor satellite data for any upcoming cyclone
or heavy rain.
In a competitive business environment, it is important
for business organisations to continuously monitor and
analyse market behaviour with respect to their products
and take actions accordingly. Besides, companies
identify customer demands as well as feedback, and
make changes in their products or services accordingly.
The dynamic pricing concept used by airlines and
railways is another example where they decide the price
based on the relationship between demand and supply.
Cab booking apps increase or decrease the price
based on the demand for cabs at a particular time. Certain
restaurants offer discounted prices (called happy hours);
they decide when and how much discount to offer by
analysing sales data for different time periods.
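To illustrate how sales data for different time periods might be analysed, here is a small sketch in Python. The sales records are hypothetical, and the helper names `sales_by_hour` and `quietest_hours` are assumptions made for this example; a real restaurant would use many days of data and its own criteria.

```python
from collections import defaultdict

# Hypothetical (hour_of_day, bill_amount) records for one day.
sales = [
    (12, 450), (13, 900), (13, 650), (15, 120),
    (16, 80), (19, 700), (20, 1100), (20, 950),
]


def sales_by_hour(records):
    """Add up the bill amounts recorded in each hour."""
    totals = defaultdict(int)
    for hour, amount in records:
        totals[hour] += amount
    return dict(totals)


def quietest_hours(totals, how_many=2):
    """Return the hours with the lowest total sales, lowest first."""
    return sorted(totals, key=totals.get)[:how_many]


totals = sales_by_hour(sales)
print(totals)
print("Candidate happy hours:", quietest_hours(totals))
```

Note that only hours with at least one recorded sale appear in the totals; the hours with the lowest totals are the natural candidates for a discount.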
Besides business, the following are some other scenarios
where data are also stored and analysed for making
decisions:
• Electronic voting machines are used for recording
the votes cast. Subsequently, the voting data from
all the machines are accumulated to declare election
results in a shorter time than manual
counting of ballot papers.
• Scientists record data while doing experiments to
calculate and compare results.
• Pharmaceutical companies record data while trying
out a new medicine to see its effectiveness.
• Libraries maintain data about books in the library
and the membership of the library.
• Search engines give us results after analysing
large volumes of data available on websites across
the World Wide Web (WWW).
• Weather alerts are generated by analysing data
received from various satellites.
5.1.2 Types of Data
As data come from different sources, they can be in
different formats. For example, an image is a collection
of pixels; a video is made up of frames; a fee slip is
made up of a few numeric and non-numeric entries; and
messages/chats are made up of texts, icons (emoticons)
and images/videos. Two broad categories in which data
can be classified on the basis of their format are:
(A) Structured Data
Data which is organised and can be recorded in a
well-defined format is called structured data. Structured
data is usually stored in a computer in a tabular format
(in rows and columns), where each column represents the
data for a particular parameter, called an attribute,
characteristic or variable, and each row represents the data
of one observation across the different attributes. Table 5.1
shows structured data related to an inventory of kitchen items
maintained by a shop.
Given this data, using a spreadsheet or other such
software, the shop owner can find out the total number of
items by summing the column Items_in_Inventory
of Table 5.1. The owner of the shop can also
calculate the total value of all items in the inventory
by multiplying each entry of column 3 (Unit Price) with
the corresponding entry of column 5 (Items_in_Inventory)
and finding their sum.
Table 5.2 shows more examples of structured data
recorded for different attributes.
Table 5.2 Attributes maintained for different activities
Entity/Activities             Data Fields/Parameters/Attributes
Books at a shop               BookTitle, Author, Price, YearofPublication
Depositing fees in a school   StudentName, Class, RollNo, FeesAmount, DepositDate
Amount withdrawal from ATM    AccHolderName, AccountNo, TypeofAcc, DateofWithdrawal,
                              AmountWithdrawn, ATMid, TimeOfWithdrawal
Table 5.1 Structured data about kitchen items in a shop
ModelNo   ProductName       Unit Price   Discount(%)   Items_in_Inventory
ABC1      Water bottle      126          8             13
ABC2      Melamine Plates   320          5             45
ABC3      Dinner Set        4200         10            8
GH67      Jug               80           0             10
GH78      Table Spoon       120          5             14
GH81      Bucket            190          12            6
NK2       Kitchen Towel     25           0             32
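The two calculations described for Table 5.1 (the total number of items and the total value of the inventory) can be sketched in Python as follows. This is only one possible way of doing it: the table is typed in as comma separated text, and the column names `UnitPrice` and `Discount` are shortened forms assumed here for convenience.

```python
import csv
import io

# Table 5.1 written as CSV text; in practice this could be a file on disk.
TABLE_5_1 = """ModelNo,ProductName,UnitPrice,Discount,Items_in_Inventory
ABC1,Water bottle,126,8,13
ABC2,Melamine Plates,320,5,45
ABC3,Dinner Set,4200,10,8
GH67,Jug,80,0,10
GH78,Table Spoon,120,5,14
GH81,Bucket,190,12,6
NK2,Kitchen Towel,25,0,32
"""

# Each row becomes a dictionary keyed by the column (attribute) names.
rows = list(csv.DictReader(io.StringIO(TABLE_5_1)))

# Total number of items: sum of the Items_in_Inventory column.
total_items = sum(int(row["Items_in_Inventory"]) for row in rows)

# Total value: unit price x items in inventory, added up over all rows.
# Discounts are ignored here, as in the calculation described in the text.
total_value = sum(
    int(row["UnitPrice"]) * int(row["Items_in_Inventory"]) for row in rows
)

print("Total items:", total_items)
print("Total value:", total_value)
```

Because every row has the same attributes, the same expression can be applied to each row in turn; this regularity is what makes structured data easy to process.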
Activity 5.1
Observe the Voter Identity cards of your family members and
identify the data fields under which the data are organised.
Are they the same for all?
(B) Unstructured Data
A newspaper contains various types of news items
which are also called data. But there is no fixed pattern
that a newspaper follows in placing news articles. One
day there might be three images of different sizes on
a page along with five news items and one or more
advertisements, while on another day there might be
one big image with three textual news items. So there is
no particular format or fixed structure for printing
news. Another example is the content of an email.
There is no fixed structure about how many lines or
paragraphs one has to write in an email or how many
files are to be attached with an email. In summary,
data which are not in the traditional row and column
structure are called unstructured data.
Examples of unstructured data include web pages
consisting of text as well as multimedia content
(images, graphics, audio/video). Other examples include
text documents, business reports, books, audio/video
files and social media messages. Although there are ways
to process unstructured data, we are going to focus on
handling structured data only in this book.
Unstructured data are sometimes described with
the help of some other data called metadata. Metadata
is basically data about data. For example, we describe
different parts of an email as subject, recipient, main
body, attachment, etc. These are the metadata for the
email data. Likewise, we can have some metadata for an
image file as image size (in KB or MB), image type (for
example, JPEG, PNG), image resolution, etc.
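The email example can be made concrete with a short Python sketch that separates an email's metadata (its headers) from its data (the main body). The message below is hypothetical, and the standard library `email` module is used here only as one convenient way to read it.

```python
from email import message_from_string

# A hypothetical email, written out as plain text for illustration.
raw_email = """\
From: asha@example.com
To: ravi@example.com
Subject: Fee receipt
Date: Mon, 05 Aug 2019 10:30:00 +0530

Please find the fee receipt for this term.
"""

message = message_from_string(raw_email)

# The headers describe the email: they are metadata.
metadata = {name: message[name] for name in ("From", "To", "Subject", "Date")}

# The body is the data that the metadata describes.
body = message.get_payload()

for name, value in metadata.items():
    print(f"{name}: {value}")
print("Body:", body.strip())
```

The body itself has no fixed structure, but the metadata follows a regular name and value pattern, which is why metadata is often used to organise and search unstructured data.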
Think and Reflect
When we click a photograph using our digital or mobile
camera, does it have some metadata associated with it?

5.2 Data Collection
For processing data, we need to collect or gather data
first. We can then store the data in a file or database
for later use. Data collection here means identifying
already available data or collecting data from the appropriate
sources. Suppose there are three different scenarios
where sales data in a grocery store are available:
• Sales data are available with the shopkeeper in a
diary or register. In this case, we should enter the
data in a digital format, for example, in a spreadsheet.
• Data are already available in a digital format, say in
a CSV (comma separated values) file.
• The shopkeeper has so far not recorded any data in
either form but wants to get software developed for