Data Collection in Research

  • Post last modified: 8 January 2022

What is Data?

Data are facts, figures and other relevant materials, past and present, serving as bases for study and analysis.

Meaning of Data

The search for answers to research questions calls for the collection of data. “Data are facts, figures and other relevant materials, past and present, serving as bases for study and analysis”.


Types of Data

The data needed for social science research may be broadly classified into:

  • Personal Data (relating to human beings) are of two types:
    • Demographic and socio-economic characteristics of individuals, such as name, sex, race, social class, relation, education, occupation, income etc.
    • Behavioural variables: attitudes, opinions, knowledge, practice, intentions etc.

  • Organisation Data: consist of data relating to an organization's origin, ownership, function, performance etc.

  • Territorial Data: relate to the geo-physical characteristics, population, infrastructure etc. of divisions like villages, cities, taluks, districts, states etc.

Sources of Data

The sources of data may be classified into (a) primary sources and (b) secondary sources.

Both sources of information have their merits and demerits. The selection of a particular source depends upon the (a) purpose and scope of enquiry, (b) availability of time, (c) availability of finance, (d) accuracy required, (e) statistical tools to be used, (f) sources of information (data), and (g) method of data collection.

Primary Sources of Data

Primary sources are original sources from which the researcher directly collects data that have not been previously collected. An example is the collection of data on brand awareness, brand preference, brand loyalty and other aspects of consumer behaviour directly by the researcher, by interviewing a sample of consumers. Primary data are firsthand information collected through various methods such as observation, interviewing, mailing etc.

According to P. V. Young, “primary sources are those data gathered at first hand and the responsibility for their compilation and promulgation remaining under the same authority that originally gathered them.”

In the words of Walter R. Borg, “Primary sources are direct descriptions of occurrences by an individual who actually observed or witnessed the occurrences.”

Advantages of Primary Data

  • It is an original source of data.

  • It is possible to capture the changes occurring in the course of time.

  • It is flexible to the advantage of the researcher.

  • Researchers know its accuracy.

  • Only those data are collected which meet the objectives of the research project.

  • In most methods of primary data collection, researchers know who the respondents are, so face-to-face communication is possible.

  • It is the most authentic, since the information is not filtered or tampered with.

  • Extensive research studies are based on primary data.

Disadvantages of Primary Data

  • Primary data are expensive to obtain.

  • Collecting them is time-consuming.

  • It requires a large number of skilled research personnel.

  • It is difficult to administer.

  • The chances of bias are high.

  • Bias can also arise on the part of respondents, who may give wrong answers that affect the accuracy of the data.

  • It may have narrow coverage: researchers may collect data only within their reach or according to their own mindset.

Secondary Sources of Data

These are sources containing data which have been collected and compiled for another purpose. Secondary sources consist of readily available compendia and already compiled statistical statements and reports whose data may be used by researchers for their studies, e.g., census reports, annual reports and financial statements of companies, statistical statements, reports of government departments, annual reports of currency and finance published by the Reserve Bank of India, statistical statements relating to cooperatives and regional banks published by NABARD, reports of the National Sample Survey Organisation, reports of trade associations, publications of international organizations such as the UNO, IMF, World Bank, ILO and WHO, trade and financial journals, newspapers etc.

Secondary sources consist not only of published records and reports, but also of unpublished records. The latter category includes various records and registers maintained by firms and organizations, e.g., accounting and financial records, personnel records, registers of members, minutes of meetings, inventory records etc.

Features of Secondary Data

Though secondary sources are diverse and consist of all sorts of materials, they have certain common characteristics.

  • First, they are ready-made and readily available, and do not require the trouble of constructing tools and administering them.

  • Second, they consist of data over whose collection and classification the researcher has no original control. Both the form and the content of secondary sources are shaped by others. Clearly, this is a feature which can limit the research value of secondary sources.

  • Finally, secondary sources are not limited in time and space. That is, the researcher using them need not have been present when and where they were gathered.

Uses of Secondary Data

Secondary data may be used in three ways by a researcher. First, some specific information from secondary sources may be used for reference purposes. For example, general statistical information on the number of co-operative credit societies in the country, their coverage of villages, their capital structure, volume of business etc., may be taken from published reports and quoted as background information in a study on the evaluation of the performance of co-operative credit societies in a selected district/state.

Second, secondary data may be used as benchmarks against which the findings of research may be tested, e.g., the findings of a local or regional survey may be compared with the national averages; the performance indicators of a particular bank may be tested against the corresponding indicators of the banking industry as a whole; and so on.

Finally, secondary data may be used as the sole source of information for a research project. Such studies as securities market behaviour, financial analysis of companies, trends in credit allocation in commercial banks, sociological studies on crimes, historical studies, and the like, depend primarily on secondary data.

Year books, statistical reports of government departments, reports of public organizations such as the Bureau of Public Enterprises, census reports etc. serve as major data sources for such research studies.
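
The benchmark use described above can be illustrated with a small calculation. The following is a minimal sketch in Python, not a prescribed method: the function name `compare_to_benchmark`, the income figures and the national average are all hypothetical, and a one-sample t-statistic is only one possible way to test survey findings against a published figure.

```python
from math import sqrt
from statistics import mean, stdev


def compare_to_benchmark(sample, benchmark):
    """Compare primary (survey) data with a secondary-data benchmark.

    Returns the sample mean, its difference from the benchmark, the
    standard error of the mean, and a one-sample t-statistic.
    """
    n = len(sample)
    sample_mean = mean(sample)
    std_error = stdev(sample) / sqrt(n)  # standard error of the mean
    difference = sample_mean - benchmark
    return {
        "n": n,
        "sample_mean": sample_mean,
        "difference": difference,
        "std_error": std_error,
        "t_statistic": difference / std_error,
    }


# Hypothetical primary data: incomes reported in a local survey.
local_incomes = [180, 220, 200, 240, 160]
# Hypothetical secondary data: a national average from a published report.
national_average = 250

result = compare_to_benchmark(local_incomes, national_average)
print(result)
```

A large t-statistic in absolute value suggests that the local finding differs from the national benchmark by more than sampling variation alone would explain. Such a comparison is only meaningful if the definitions, units and time period of the secondary figure match those of the survey.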

Advantages of Secondary Data

Secondary sources have some advantages:

  • Secondary data, if available, can be secured quickly and cheaply. Once the source documents and reports are located, collection of data is just a matter of desk work. Even the tediousness of copying the data from the source can now be avoided, thanks to Xeroxing facilities.

  • A wider geographical area and a longer reference period may be covered without much cost. Thus, the use of secondary data extends the researcher’s space and time reach.

  • The use of secondary data broadens the data base from which scientific generalizations can be made.

  • Secondary data can supply the environmental and cultural settings required for the study.

  • The use of secondary data enables a researcher to verify the findings based on primary data. It readily meets the need for additional empirical support. The researcher need not wait until additional primary data can be collected.

Disadvantages of Secondary Data

Although secondary data are easy to access and cost-effective, they also have significant limitations:

  • Secondary data may not be up to date and may already be obsolete by the time they appear in print, because of the time lag in producing them. For example, population census data are published two or three years after compilation, and no new figures will be available for another ten years.

  • Data may be too broad-based, that is, not specific enough to adequately address the firm’s research questions.

  • The units in which the data are presented may not be meaningful.

  • The source of the data may not provide sufficient supporting material to allow the researcher to judge the quality of the research.

  • The data sources may lack reliability and credibility. Some secondary data may simply be inaccurate.

  • The most important limitation is that the available data may not meet the researcher's specific needs. The definitions adopted by those who collected the data may be different; units of measure may not match; and time periods may also be different.

  • The available data may not be as accurate as desired. To assess their accuracy we need to know how the data were collected.

  • Finally, information about the whereabouts of sources may not be available to all social scientists. Even if the location of the source is known, the accessibility depends primarily on proximity.

    For example, most unpublished official records and compilations are located in the capital city, and they are not within easy reach of researchers based in far-off places.

Difference Between Primary Data and Secondary Data

  • Primary data are collected by the researcher himself, whereas secondary data have been collected previously by other researchers.

  • Primary data are collected and used for the first time. With secondary data, however, some decisions have already been made by others, and such decisions may or may not be useful for the researcher now.

  • Since primary data are collected by the researcher himself, they relate directly, or at least closely, to the research objective. However, many parts of secondary data need not be related to the research objective.


  • Since primary data are collected by the researcher, gathering them takes more of the researcher's time than obtaining secondary data.
