Abstract
Proper sample selection is based on the study purpose, research question(s), and study design. Investigators must use care to select a sample population that is representative of the source and target populations. Well-defined inclusion and exclusion criteria serve as guidance when screening potential candidates for eligibility for participation in a study. Sampling and non-sampling errors may influence study outcomes and generalizability of results. The purpose of this short report is to review common sampling errors made when designing a study and when reporting study outcomes.
CASE STUDY
An investigator wishes to study how practicing dental hygienists use social media to promote daily oral hygiene self-care behaviors in their patients. The investigator creates an electronic survey containing demographic questions that ask participants about their age, year of graduation from dental hygiene school, current state of residence, and number of years in practice. Other questions ask whether the dental hygienist uses social media to communicate with patients, which platforms are used, whether and how the importance of daily oral hygiene is discussed, and whether their patients like receiving posts about the importance of daily oral hygiene on social media.
The investigator wants to collect responses from a national sample of practicing dental hygienists. To determine how many dental hygienists are employed in the United States, the investigator uses data from the U.S. Bureau of Labor Statistics, which indicate that 214,700 dental hygienists were employed in the U.S. as of May 2022.1 The investigator decides to use social media to access dental hygienists from across the country. A brief announcement describing the purpose of the study is developed and includes a link to the anonymous survey. Potential participants are informed that clicking on the link to access and complete the survey indicates their consent to participate. The investigator then posts the announcement to several social media platforms and, within a few days, receives 23,738 responses. After two weeks, no additional responses are collected, so the survey is closed.
In a related manuscript submitted for publication, the investigator reports an 11% response rate to the survey. Reported demographic data reflect respondents from 17 states, most of whom graduated within the past 15 years, with an average of 10 years in practice and a mean age of 30.7 years. Key findings were that 63% of respondents reported using social media to communicate with patients across a variety of platforms, with most posting messages about the importance of daily brushing and flossing. Almost all respondents (89%) stated that their patients liked receiving these social media posts. The investigator concludes that practicing dental hygienists use social media to communicate with their patients and that social media is a valuable tool to reinforce patients’ daily oral self-care behaviors.
What errors did the investigator make in designing the study and reporting the study findings?
Sampling and Reporting Errors in the Case Study
In the present study, the target population would be practicing dental hygienists. The source population would be currently practicing dental hygienists who also use social media for patient communication, as only these individuals would have patients with whom they can interact on social media. However, the investigator did not confirm eligibility to participate among those who responded to the survey. The survey asked about the number of years in practice, not whether the respondent was currently practicing, an important criterion. To correct this error, the electronic survey should have included a question about whether the individual was currently practicing, with built-in skip logic that would end study participation for those who indicated that they were not practicing. It is unknown how many respondents were not dental hygienists, or were dental hygienists who were not currently practicing. In this study, participants from the sample population may not have been representative of the source population.
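As a minimal illustration only, eligibility screening with skip logic amounts to a simple branching rule at the start of the instrument; the function name and survey steps below are hypothetical and not part of the case study's survey.

def screen_respondent(is_dental_hygienist: bool, currently_practicing: bool) -> str:
    """Route a respondent based on the eligibility items asked first."""
    if is_dental_hygienist and currently_practicing:
        return "continue_survey"  # eligible: proceed to the social media questions
    return "end_survey"           # skip logic: thank the respondent and end participation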
The investigator reports an 11% response rate. This response rate was calculated by dividing the 23,738 respondents by the 214,700 dental hygienists employed nationally. However, it is impossible for the investigator to calculate the actual response rate, as it is unknown how many of the 214,700 dental hygienists saw the announcement on social media inviting them to participate. Further, it is unknown how many dental hygienists saw the announcement and chose not to participate, a form of non-response bias. Those who did not respond may have very different opinions about using social media from those who chose to respond. In the manuscript, the investigator should have simply stated the number of respondents to the survey. It is important to note that many reputable journals will not accept studies with a low response rate, as they lack generalizability to the target population.
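For reference, the reported figure corresponds to the proportion of respondents among all employed dental hygienists:

\[ \frac{23{,}738}{214{,}700} \approx 0.11 = 11\% \]

Because the true denominator, the number of dental hygienists who actually saw the invitation, is unknown, this proportion is not a valid response rate.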
Participants were from 17 states, which, depending upon where those states are located, may reflect broad geographic representation but does not adequately reflect a national sample. Participants were also younger, with just over a decade of practice experience. It is logical to assume that younger dental hygienists may be more active on social media and, therefore, more likely to see and respond to the invitation to participate in the study. They may also be more likely to use social media as a method to communicate with their patients. Voluntary response bias may also exist: respondents may have shared similar feelings about the topic that motivated their participation, which can lead to over-reporting of the importance of study findings.
There are other issues that may have biased the outcomes of the study. The investigator should have included questions characterizing the participants’ general social media engagement and assessing their perceptions of the value of receiving health messaging through this medium. Those who like receiving health information via social media may be more likely to adopt this strategy with their patients. They may feel strongly about the value and benefits of using social media, or they may be highly engaged with social media for other types of office communications, all of which may influence their attitudes and beliefs, posing a risk of response bias.
Conclusions drawn by the investigator overstated the actual findings of the study. This is a common reporting mistake that can easily be prevented by forming conclusions that are specific to the study results only. For example, because 63% of the respondents reported using social media to communicate with patients, the investigator could have concluded that, “In this study, nearly two-thirds of dental hygienist respondents used social media to communicate messaging about oral health to their patients.” Although most respondents (89%) believed that their patients like receiving these messages, these data reflect the opinions of the respondents only. The investigator overstates the findings by concluding that social media health messaging is a valuable tool for reinforcing patients’ daily oral self-care behaviors. The only way to know whether this is true would be to assess patient behaviors before and after exposure to health messaging delivered through this communication strategy. To correct this error, the investigator could have simply concluded that, “The majority of respondents believe that their patients like receiving oral health messaging via social media.”
Common Sampling Errors
Sampling refers to selecting the people who will participate in a study and from whom data are collected. Sampling bias occurs when the sample population is not representative of the target population. Sampling error, however, is not the result of bias or of selecting the wrong people for a study. Rather, sampling error is a statistical error that reflects the difference between the mean values of the sample and the mean values of the entire population.2 Sampling error cannot be avoided, as a sample population can never completely represent the broader population: some variation in characteristics will inevitably exist. Sampling error is measured by the standard error, a type of standard deviation that reflects how much the sample mean differs from the population mean. Sampling errors are associated with variations in the representativeness of the sample that responds. Sampling errors can be minimized by having an adequate sample size, using due diligence to obtain an adequate number of respondents (e.g., in surveys, contacting potential respondents several times with reminders to complete the survey), and using sound study design.2
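As a point of reference, the standard error of the mean is commonly estimated from the sample standard deviation s and the sample size n:

\[ SE_{\bar{x}} = \frac{s}{\sqrt{n}} \]

Because n appears in the denominator, larger samples yield smaller standard errors, which is why an adequate sample size helps minimize sampling error.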
Non-sampling errors occur with poor study design, incorrect sampling methods, or with the way that information is gathered, recorded, tabulated or analyzed. Non-sampling errors are not related to sample size. With survey research, question design (e.g., writing leading questions or limiting response choices) and the order of the questions can lead to these types of errors. Examples include:2,3
Population-specific errors occur when the investigator does not know who should be included in the study. Subject selection should be based on the research question.
Sample-frame errors occur when members of the sample population are selected from the wrong source population; participants either do not represent the population of interest, or representative participants are not included.
Selection errors occur when participants are not chosen at random but are self-selected volunteers. This type of error is common with convenience sampling.
Non-response error, a type of response bias, reflects differences in characteristics between respondents and non-respondents. This type of error occurs when potential subjects are not contacted, and thus, do not have the chance to respond, or when they choose not to respond.
Undercoverage error happens when some groups within the population are inadequately represented among respondents. For example, investigators who design surveys must be careful to ensure that respondents of different ages, races and ethnicities, geographic locations, etc. can access the survey in order to ensure representativeness.3
Convenience sampling error is a risk because, by their nature, convenience samples are selected for ease of access rather than at random. They are usually not representative of the source or target populations, which limits the generalizability of results from studies that use this type of methodology.3
Researcher bias leads to sampling errors when the investigator’s personal bias influences who is selected for the sample. Selected participants may differ in demographics, personal experiences, or opinions, all of which may influence the results. Results may not accurately reflect the phenomenon under study and cannot be generalized. This bias can be reduced by using random sampling instead.3
Non-sampling errors may also occur with measurement errors, data analysis errors, or response errors (e.g., when respondent answers are inaccurate, misinterpreted or documented incorrectly).3
CONCLUSION
Sampling errors are associated with a lack of representativeness of the sample population. Non-sampling errors occur because of poor study design or implementation, and when various forms of bias are introduced during subject selection. Sampling errors can be reduced by increasing the sample size. Randomization is a helpful strategy to reduce bias and non-sampling errors. Investigators should pay careful attention to potential bias that can influence study results and take corrective measures, as appropriate, to improve study design and enhance the quality of their research.
Footnotes
NDHRA priority area, Professional development: Education (evaluation).
DISCLOSURES
The author has no conflicts of interest to disclose.
- Received July 1, 2023.
- Accepted July 27, 2023.
- Copyright © 2023 The American Dental Hygienists’ Association