What is Data?
Data are facts, figures and other relevant materials, past and present, that serve as bases for study and analysis.
Meaning of Data
The search for answers to research questions calls for the collection of data. “Data are facts, figures and other relevant materials, past and present, serving as bases for study and analysis”.
Characteristics of Data
For a numerical description to be called data, it must possess the following characteristics:
- Data are an aggregate of facts: A single, unconnected figure cannot be used to study the characteristics of a business activity.
- Data are affected to a large extent by a multiplicity of factors: In a business environment, for example, recorded observations are affected by a number of factors, both controllable and uncontrollable.
- Data are estimated according to a reasonable standard of accuracy: In measuring length, for example, one may measure correctly up to 0.01 cm; similarly, the quality of a product is estimated by tests on small samples drawn from large lots.
- Data are collected in a systematic manner for a predetermined objective: Facts collected haphazardly, without a clear awareness of the objective, will be confusing and cannot serve as the basis of valid conclusions. For example, collecting data on prices serves no purpose unless the researcher knows whether wholesale or retail prices are wanted and which commodities are under consideration.
- Data must be related to one another: The data collected should be comparable; otherwise they cannot be placed in relation to each other. For example, data on crop yield and soil quality are related, but crop yields have no relation to data on the health of the people.
- Data must be numerically expressed: Any fact, to be called data, must be expressed numerically or quantitatively. Qualitative characteristics such as beauty and intelligence are called attributes and must be scaled before they can be expressed in numeric terms.
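The last characteristic, scaling a qualitative attribute so that it can be expressed numerically, can be sketched in Python. The rating labels, the 1-to-5 scale and the function name `to_scores` are illustrative assumptions, not part of the original text.

```python
# Hypothetical ordinal scale for a qualitative attribute (labels are assumed).
SCALE = {"poor": 1, "fair": 2, "good": 3, "very good": 4, "excellent": 5}

def to_scores(ratings):
    """Convert qualitative ratings to numeric scores using SCALE."""
    return [SCALE[r.strip().lower()] for r in ratings]

ratings = ["Good", "excellent", "Fair", "good"]
scores = to_scores(ratings)

# Once scaled, the attribute can be summarised like any other numeric data.
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

An ordinal scale of this kind preserves only the ranking of the labels, so the average should be read with that limitation in mind.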
Types of Data
The data needed for social science research may be broadly classified into:
- Personal data (relating to human beings), which are of two types:
  - Demographic and socio-economic characteristics of individuals, such as name, sex, race, social class, relation, education, occupation, income etc.
  - Behavioural variables: attitudes, opinions, knowledge, practices, intentions etc.
- Organisational data: data relating to an organization's origin, ownership, function, performance etc.
- Territorial data: data relating to the geo-physical characteristics, population, infrastructure etc. of divisions such as villages, cities, taluks, districts, states etc.
Sources of Data
The sources of data may be classified into primary sources and secondary sources. Both sources of information have their merits and demerits.
The selection of a particular source depends upon:
- purpose and scope of enquiry,
- availability of time,
- availability of finance,
- accuracy required,
- statistical tools to be used,
- sources of information (data),
- method of data collection.
Primary Sources of Data
Primary sources are original sources from which the researcher directly collects data that have not been previously collected. For example, a researcher may collect data on brand awareness, brand preference, brand loyalty and other aspects of consumer behaviour directly from a sample of consumers by interviewing them. Primary data are firsthand information collected through various methods such as observation, interviewing, mailing etc.
According to P. V. Young, “primary sources are those data gathered at first hand and the responsibility of their compilation and promulgation remaining under the same authority that originally gathered them.”
In the words of Walter R. Borg, “Primary sources are direct descriptions of occurrences by an individual who actually observed or witnessed the occurrences.”
Advantages of Primary Data
- It is an original source of data.
- It is possible to capture changes occurring over the course of time.
- It is flexible, to the advantage of the researcher.
- Researchers know how accurate it is.
- Only data that meet the objectives of the research project are collected.
- In most methods of primary data collection, researchers know who the respondents are, so face-to-face communication is possible.
- It is the most authentic, since the information has not been filtered or tampered with.
- Extensive research studies are based on primary data.
Disadvantages of Primary Data
- Primary data are expensive to obtain.
- Collection is time consuming.
- It requires a large number of skilled research personnel.
- It is difficult to administer.
- The chance of bias is considerable.
- Bias can also arise on the part of respondents, who may give wrong answers that affect the accuracy of the data.
- It may have narrow coverage: researchers may collect data only within their reach or according to their own mindset.
Secondary Sources of Data
These are sources containing data that have been collected and compiled for another purpose. Secondary sources consist of readily available compendia and already compiled statistical statements and reports whose data may be used by researchers for their studies, e.g., census reports; annual reports and financial statements of companies; statistical statements; reports of government departments; annual reports on currency and finance published by the Reserve Bank of India; statistical statements relating to cooperatives and regional banks published by NABARD; reports of the National Sample Survey Organization; reports of trade associations; publications of international organizations such as the UNO, IMF, World Bank, ILO and WHO; and trade and financial journals, newspapers etc.
Secondary sources consist not only of published records and reports, but also of unpublished records. The latter category includes various records and registers maintained by firms and organizations, e.g., accounting and financial records, personnel records, registers of members, minutes of meetings, inventory records etc.
Features of Secondary Data
Though secondary sources are diverse and consist of all sorts of materials, they have certain common characteristics.
- First, they are readymade and readily available, and do not require the trouble of constructing tools and administering them.
- Second, they consist of data over whose collection and classification the researcher has no original control. Both the form and the content of secondary sources are shaped by others. Clearly, this is a feature which can limit the research value of secondary sources.
- Finally, secondary sources are not limited in time and space. That is, the researcher using them need not have been present when and where they were gathered.
Uses of Secondary Data
Secondary data may be used by a researcher in three ways. First, specific information from secondary sources may be used for reference purposes. For example, general statistical information on the number of co-operative credit societies in the country, their coverage of villages, their capital structure, volume of business etc., may be taken from published reports and quoted as background information in a study evaluating the performance of co-operative credit societies in a selected district or state.
Second, secondary data may be used as benchmarks against which the findings of research may be tested, e.g., the findings of a local or regional survey may be compared with the national averages; the performance indicators of a particular bank may be tested against the corresponding indicators of the banking industry as a whole; and so on.
Finally, secondary data may be used as the sole source of information for a research project. Studies of securities market behaviour, financial analysis of companies, trends in credit allocation in commercial banks, sociological studies of crime, historical studies, and the like depend primarily on secondary data.
Year books, statistical reports of government departments, reports of public organizations and of the Bureau of Public Enterprises, census reports etc. serve as major data sources for such research studies.
Advantages of Secondary Data
Secondary sources have some advantages:
- Secondary data, if available, can be secured quickly and cheaply. Once the source documents and reports are located, collecting the data is just a matter of desk work. Even the tedium of copying the data from the source can now be avoided, thanks to photocopying facilities.
- Wider geographical area and longer reference period may be covered without much cost. Thus, the use of secondary data extends the researcher’s space and time reach.
- The use of secondary data broadens the data base from which scientific generalizations can be made.
- Environmental and cultural settings are required for the study.
- The use of secondary data enables a researcher to verify findings based on primary data. It readily meets the need for additional empirical support. The researcher need not wait until additional primary data can be collected.
Disadvantages of Secondary Data
Although secondary data are easy to access and cost-effective, they also have significant limitations:
- Secondary data are often not up to date and may already be obsolete by the time they appear in print, because of the time lag in producing them. For example, population census data are published two or three years after compilation, and no new figures will be available for another ten years.
- Data may be too broad-based, that is, not specific enough to adequately address the firm’s research questions.
- The units in which the data are presented may not be meaningful.
- The source of the data may not provide sufficient supporting material to allow the researcher to judge the quality of the research.
- The data sources may lack reliability and credibility. Some secondary data may simply be inaccurate.
- The most important limitation is that the available data may not meet our specific needs. The definitions adopted by those who collected the data may be different; units of measure may not match; and time periods may also be different.
- The available data may not be as accurate as desired. To assess their accuracy we need to know how the data were collected.
- Finally, information about the whereabouts of sources may not be available to all social scientists. Even if the location of the source is known, accessibility depends primarily on proximity. For example, most unpublished official records and compilations are located in the capital city, and they are not within easy reach of researchers based in far-off places.
Difference Between Primary Data and Secondary Data
- Primary data are collected by the researcher himself or herself, whereas secondary data have been collected previously by other researchers.
- Primary data are collected and used for the first time. With secondary data, however, some decisions have already been made, and those decisions may or may not be useful for the present researcher.
- Since primary data are collected by the researcher, they relate directly, or at least closely, to the research objective. However, many parts of secondary data need not be related to the research objective.
- Since primary data are collected by the researcher, obtaining them takes more time than obtaining secondary data.
Processing of Data
The various stages of data analysis process are given below:
Stage 1: Data cleaning
Data cleaning is an important procedure during which the data are inspected and erroneous data are corrected, if that is necessary, preferable and possible. Data cleaning can be done during the stage of data entry. If this is done, it is important that no subjective decisions are made. It should always be possible to undo any alterations to the data set. Therefore, it is important not to throw information away at any stage in the data cleaning phase.
All information should be saved (i.e., when altering variables, both the original values and the new values should be kept, either in a duplicate data set or under a different variable name), and all alterations to the data set should be carefully and clearly documented, for instance in a syntax file or a log.
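The principle of keeping original values and logging every alteration can be sketched in Python. This is a minimal illustration, not a prescribed procedure: the field names, the `clean_ages` function and the 0 to 120 plausibility rule are assumptions made for the example.

```python
import copy

def clean_ages(records, log):
    """Return a cleaned copy of records; the input list is left untouched."""
    cleaned = copy.deepcopy(records)
    for row in cleaned:
        age = row["age"]
        # Assumed plausibility rule: ages outside 0-120 are treated as entry errors.
        if age is not None and not (0 <= age <= 120):
            # The corrected value goes under a different variable name,
            # so the original value in "age" is never overwritten.
            row["age_clean"] = None
            log.append(f"id={row['id']}: age {age} set to missing (out of range)")
        else:
            row["age_clean"] = age
    return cleaned

raw = [{"id": 1, "age": 34}, {"id": 2, "age": 340}, {"id": 3, "age": None}]
change_log = []
cleaned = clean_ages(raw, change_log)
print(change_log)
```

Because the raw records are copied rather than edited and every change is written to `change_log`, any alteration can be reviewed and undone later.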
Stage 2: Initial data analysis
The most important distinction between the initial data analysis phase and the main analysis phase is that during initial data analysis one refrains from any analysis aimed at answering the original research question. The initial data analysis phase is guided by four questions, which the following stages address.
Stage 3: Check the quality of data
The quality of the data should be checked as early as possible. Data quality can be assessed in several ways, using different types of analyses: frequency counts, descriptive statistics (mean, standard deviation, and median), normality (skewness, kurtosis, frequency histograms, normal probability plots), associations (correlations, scatter plots). Other initial data quality checks are:
- Checks on data cleaning: have decisions influenced the distribution of the variables? The distribution of the variables before data cleaning is compared with the distribution after data cleaning to see whether cleaning has had unwanted effects on the data.
- Analysis of missing observations: are there many missing values, and are the values missing at random? The missing observations are analyzed to see whether more than 25% of the values are missing, whether they are missing at random (MAR), and whether some form of imputation is needed.
- Analysis of extreme observations: outlying observations in the data are analyzed to see whether they seem to disturb the distribution.
- Comparison and correction of differences in coding schemes: variables are compared with the coding schemes of variables external to the data set and possibly corrected if the coding schemes are not comparable.
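Two of these checks, the share of missing values and the search for extreme observations, can be sketched with Python's standard library. The sample values are invented, and the 1.5 × IQR rule used to flag outliers is a common convention assumed here; the text itself does not prescribe a particular rule.

```python
import statistics

def missing_share(values):
    """Fraction of observations that are missing (recorded as None)."""
    return sum(v is None for v in values) / len(values)

def iqr_outliers(values):
    """Flag observations more than 1.5 * IQR beyond the quartiles (assumed rule)."""
    observed = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(observed, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in observed if v < low or v > high]

values = [12, None, 14, 15, 15, None, 16, 17, 18, 95]
share = missing_share(values)
too_many_missing = share > 0.25  # the 25% threshold mentioned in the text
outliers = iqr_outliers(values)
print(share, too_many_missing, outliers)
```

Whether a flagged value is an error or a genuine extreme observation remains a judgment for the researcher; the code only points to candidates.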
Stage 4: Measurement of Quality
The quality of the measurement instruments should only be checked during the initial data analysis phase when this is not the focus or research question of the study. One should check whether the structure of the measurement instruments corresponds to the structure reported in the literature.
Stage 5: Initial transformations
After assessing the quality of the data and of the measurements, one might decide to impute missing data or to perform initial transformations of one or more variables, although this can also be done during the main analysis phase.
Stage 6: Characteristics of data sample
In any report or article, the structure of the sample must be accurately described. It is especially important to exactly determine the structure of the sample (and specifically the size of the subgroups) when subgroup analyses will be performed during the main analysis phase.
The characteristics of the data sample can be assessed by looking at:
- Basic statistics of variables
- Scatter plots
- Correlations
- Cross-tabulations
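As a rough illustration of describing the structure of a sample, the sketch below counts subgroup sizes and builds a simple cross-tabulation with the standard library. The variables `sex` and `region` and the five records are invented for the example.

```python
from collections import Counter

# Hypothetical sample records (assumed fields).
sample = [
    {"sex": "F", "region": "urban"},
    {"sex": "M", "region": "urban"},
    {"sex": "F", "region": "rural"},
    {"sex": "F", "region": "urban"},
    {"sex": "M", "region": "rural"},
]

# Size of each subgroup, needed before any subgroup analysis.
subgroup_sizes = Counter(row["sex"] for row in sample)

# Cross-tabulation: count of cases for each (sex, region) combination.
crosstab = Counter((row["sex"], row["region"]) for row in sample)

for (sex, region), count in sorted(crosstab.items()):
    print(sex, region, count)
```

Small cells in the cross-tabulation are an early warning that a planned subgroup analysis may rest on too few cases.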
Stage 7: Final stage of the initial data analysis
During the final stage, the findings of the initial data analysis are documented, and necessary, preferable, and possible corrective actions are taken. Also, the original plan for the main data analyses can and should be specified in more detail and/or rewritten.
Analysis of Data
Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making. Data analysis has multiple facets and approaches, encompassing diverse techniques under a variety of names, in different business, science and social science domains.
Data analysis is a body of methods that help to describe facts, detect patterns, develop explanations and test hypotheses. It is used in all of the sciences. It is used in business, in administration and in policy. The numerical results provided by a data analysis are usually simple. It finds the number that describes a typical value and it finds differences among numbers.
Data analysis finds averages, like the average income or the average temperature, and it finds differences, like the difference in income from group to group or the difference in average temperature from year to year. Fundamentally, the numerical answers provided by data analysis are that simple. But data analysis is not about numbers; it uses them. Data analysis is about the world, asking, always asking, “How does it work?” And that’s where data analysis gets tricky.
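The two basic numerical results described here, a typical value and a difference between groups, can be computed in a few lines of Python. The group names and income figures are invented for illustration.

```python
from statistics import mean

# Hypothetical incomes (in thousands) for two groups.
incomes = {
    "group_a": [30, 35, 40],
    "group_b": [50, 55, 60],
}

# A typical value for each group...
averages = {group: mean(values) for group, values in incomes.items()}

# ...and the difference between the groups.
difference = averages["group_b"] - averages["group_a"]
print(averages, difference)
```

The arithmetic is simple; explaining why the groups differ is the harder part of the analysis.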
Types of Data Analysis
Univariate Data Analysis
Univariate analysis is the simplest form of quantitative (statistical) analysis. The analysis is carried out by describing a single variable and its attributes for the applicable unit of analysis. For example, if the variable age were the subject of the analysis, the researcher would look at how many subjects fall into each age category.
Univariate analysis contrasts with bivariate analysis (the analysis of two variables simultaneously) and multivariate analysis (the analysis of multiple variables simultaneously). Univariate analysis is used primarily for descriptive purposes, while bivariate and multivariate analysis are geared more towards explanatory purposes. Univariate analysis is commonly used in the first stages of research, in analyzing the data at hand, before being supplemented by more advanced, inferential bivariate or multivariate analysis.
A basic way of presenting univariate data is to create a frequency distribution of the individual cases, which involves presenting the number of cases observed in the sample for each attribute of the variable studied. This can be done in a table format, with a bar chart or with a similar form of graphical representation.
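A frequency distribution for a single variable such as age can be sketched as follows. The ages and the three category boundaries are assumptions chosen for the example.

```python
from collections import Counter

def age_category(age):
    """Assign an age to one of three assumed categories."""
    if age < 30:
        return "under 30"
    if age < 40:
        return "30-39"
    return "40 and over"

ages = [19, 23, 25, 31, 34, 38, 41, 47, 52, 29]
frequency = Counter(age_category(a) for a in ages)

# Print the distribution as a table with a simple text bar chart.
for category in ("under 30", "30-39", "40 and over"):
    count = frequency[category]
    print(f"{category:<12} {count:>2} {'#' * count}")
```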
Bivariate Data Analysis
Bivariate data is data that has two variables. The quantities from these two variables are often represented using a scatter plot. This is done so that the relationship (if any) between the variables is easily seen.
Dependent and Independent Variables
In some instances of bivariate data, it is determined that one variable influences or determines the second variable, and the terms dependent and independent variables are used to distinguish between the two types of variables. Correlations occur between the two variables or data sets. These are described as strong or weak, and their strength is rated on a scale of 0 to 1, with 1 being a perfect correlation and 0.1 a weak correlation.
Analysis of Bivariate Data
In the analysis of bivariate data, one typically either compares summary statistics of each of the variable quantities or uses regression analysis to find a more direct relationship between the data.
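Both approaches to relating two variables, a correlation coefficient and a regression line, can be sketched in plain Python. The helper names and the five data points are invented for illustration; the formulas are the standard Pearson correlation and ordinary least squares for one independent variable.

```python
import math

def _centered_sums(xs, ys):
    """Means and sums of squares/cross-products around the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return mx, my, sxy, sxx, syy

def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my, sxy, sxx, _ = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, my - slope * mx

x = [1, 2, 3, 4, 5]  # independent variable (hypothetical)
y = [2, 4, 5, 4, 5]  # dependent variable (hypothetical)
r = pearson_r(x, y)
slope, intercept = fit_line(x, y)
print(round(r, 3), slope, intercept)
```

The coefficient itself ranges from -1 to 1; its absolute value corresponds to the 0-to-1 strength scale described above, and the sign shows the direction of the relationship.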
Multivariate Data Analysis
Multivariate analysis (MVA) is based on the statistical principle of multivariate statistics, which involves observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest.
Uses for multivariate analysis include:
- Design for capability (also known as capability-based design).
- Inverse design, where any variable can be treated as an independent variable.
- Analysis of Alternatives (AoA), the selection of concepts to fulfill a customer need.
- Analysis of concepts with respect to changing scenarios.
- Identification of critical design drivers and correlations across hierarchical levels.