Removal of identifiable results
Identifiable results
At Lifelines we maintain the general rule that the minimum participant group size for which results can be exported or published is N = 10. This rule is enforced to minimize the risk that individual participants can be recognized (by themselves or by third parties) from the reported results, which in turn may lead to unwanted consequences (e.g. misuse of information for commercial or political reasons, or participants involuntarily learning about personal health-related results).
Below we give some examples and instructions on how to deal with results that include group sizes of N<10. Please note: these examples have been made up and do not cover all possible options (e.g. textual inclusions).
Figures
For figures it is important to consider how identifying the values might be. The depiction of outliers in particular risks identifying our participants (see example figure). The example figure shows a scatterplot, but the same applies to other figures such as boxplots, histograms, or bar charts. The only solution to such an issue is to remove the outliers or the complete figure.
Tables
In the table shown on the right several of the presented results describe participant groups that are smaller than N = 10. To adhere to our guidelines, we will ask you to alter your table. Below we present several possible adjustment methods.
Solution 1: Aggregating your results
One possible solution is to combine two (or more) small groups into one group. In our example this would entail merging the age groups <20 and 20-30, and likewise the age groups 41-50 and 51+.
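The aggregation step can be sketched in a few lines of Python. This is only an illustration: the counts are invented, and the names `aggregate`, `small_groups`, and `MIN_GROUP_SIZE` are hypothetical helpers, not part of any Lifelines tooling.

```python
MIN_GROUP_SIZE = 10  # the general publication rule: N >= 10

def aggregate(counts, grouping):
    """Sum the counts of the old groups into each new group.

    grouping maps a new label to the list of old labels it absorbs.
    """
    return {new: sum(counts[old] for old in olds) for new, olds in grouping.items()}

def small_groups(counts, minimum=MIN_GROUP_SIZE):
    """Return the labels of all groups below the minimum size."""
    return [label for label, n in counts.items() if n < minimum]

# Made-up counts per age group (not real Lifelines data).
counts = {"<20": 4, "20-30": 25, "31-40": 40, "41-50": 30, "51+": 6}

# Merge the two youngest and the two oldest age groups.
grouping = {
    "30 or younger": ["<20", "20-30"],
    "31-40": ["31-40"],
    "41 or older": ["41-50", "51+"],
}
merged = aggregate(counts, grouping)

print(small_groups(counts))  # groups that violate the rule before merging
print(small_groups(merged))  # should be empty after merging
```

If groups below the minimum remain after merging, a coarser grouping or one of the other solutions is needed.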
Solution 2: replacing small results with <10
A second possible solution is to replace the small observations with "<10". This way, the reader and participants do not know the exact number of participants presented. Make sure to also replace the zero values with <10. Important note: please make sure that the other group sizes mentioned in the table cannot be used to calculate the obscured group size!
In our example, this solution works well for the ex-smoker category, as there are several age groups with N<10. However, we are able to calculate the exact number for the current smoker category by subtracting all other age groups from the total number in the header. There are two ways to resolve the issue with the current smoker category:
- Option 1: remove the total numbers for the smoking categories. A reader should not be able to find these numbers elsewhere in the table either.
- Option 2: add uncertainty to an additional age group (in this example age group 20-30).
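The check described above can be sketched as follows. The counts are invented for illustration, and `suppress` and `exactly_recoverable` are hypothetical helper names. The check covers only the simplest case, where a published total and a single hidden cell allow the hidden value to be calculated by subtraction.

```python
MIN_GROUP_SIZE = 10  # the general publication rule: N >= 10

def suppress(counts, minimum=MIN_GROUP_SIZE):
    """Replace every count below the minimum (including zero) with '<10'."""
    return {label: (f"<{minimum}" if n < minimum else n) for label, n in counts.items()}

def exactly_recoverable(counts, total_published, minimum=MIN_GROUP_SIZE):
    """True if a reader could compute a hidden cell as total minus the visible cells.

    Simplest case only: the total is published and exactly one cell is hidden.
    """
    hidden = [label for label, n in counts.items() if n < minimum]
    return bool(total_published) and len(hidden) == 1

# Made-up counts per age group (not real Lifelines data).
ex_smoker = {"<20": 0, "20-30": 7, "31-40": 15, "41-50": 12, "51+": 3}
current_smoker = {"<20": 12, "20-30": 18, "31-40": 6, "41-50": 14, "51+": 11}

print(suppress(ex_smoker))
print(suppress(current_smoker))

# Several hidden cells: the individual values cannot be derived from the total.
print(exactly_recoverable(ex_smoker, total_published=True))
# One hidden cell plus a published total: the hidden value can be derived.
print(exactly_recoverable(current_smoker, total_published=True))
# Option 1 (dropping the total) removes that route.
print(exactly_recoverable(current_smoker, total_published=False))
```

A table can leak hidden values in other ways too, for example through row and column totals together, so a manual check remains necessary.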
Solution 3: presenting percentages instead of absolute values
The third solution is to present results as percentages of the total group in your table, instead of exact group sizes. Important note: please check carefully whether the exact group size can still be calculated.
In our example the exact numbers are still traceable, as the total sizes of the smoker groups are relatively small. There are two ways to resolve this issue:
- Option 1: remove the total group sizes.
- Option 2: display the results as "smaller than" the percentage for a group size of N = 10.
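Option 2 can be sketched as follows, assuming the threshold is simply the percentage that a group of N = 10 would represent in the total. The counts are invented, and `as_percentages` is a hypothetical helper name.

```python
MIN_GROUP_SIZE = 10  # the general publication rule: N >= 10

def as_percentages(counts, minimum=MIN_GROUP_SIZE):
    """Format counts as percentages of the total.

    Groups below the minimum are shown as 'smaller than' the percentage
    that a group of exactly `minimum` participants would have.
    """
    total = sum(counts.values())
    threshold = 100 * minimum / total
    result = {}
    for label, n in counts.items():
        if n < minimum:
            result[label] = f"<{threshold:.1f}%"
        else:
            result[label] = f"{100 * n / total:.1f}%"
    return result

# Made-up counts per age group (not real Lifelines data); the total is 80.
current_smoker = {"<20": 12, "20-30": 18, "31-40": 6, "41-50": 14, "51+": 30}

percentages = as_percentages(current_smoker)
print(percentages)
```

As with the other solutions, check that the remaining numbers in the table cannot be combined to recover the hidden count.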
Combining solutions 2 and 3
You might want to include as much information as possible in your table. In our final example we combine two previous examples as shown in the table to the right.
Solution 4: leaving out a category
The final solution is to leave out categories that are not of interest to you. By doing so, the observations with "<10" can no longer be calculated by subtracting the frequencies in the other categories from the total.
In our example, you might be interested in younger and older participants only. As a result you could leave out the age groups 20-30 and 31-40. The same principle would apply when your categories are yes and no (leave out the no or yes so that the exact number cannot be deduced).
Frequently Asked Questions (FAQs)
-
Yes, you are allowed to include graphs/figures in your paper (e.g. boxplot, scatterplot, histogram). Please be aware of identifiable outliers, as shown in the figure example on this page.
-
Our publication rule applies to all inclusions of small groups of participants, no matter the context. As a result, even when creating a table/text to describe your participants, you will have to follow the general publication rule.
-
In general we use our standard guidelines to review your manuscript before submission. However, for some research topics you might only have a small subsample to start with. In these instances we will consider whether we can make an exception for your manuscript. Make sure you consider the validity of results with such small numbers.
Please include at least the points listed below when submitting your manuscript. Lifelines will then evaluate the request.
- That you do not meet the publication rule N>=10 (please specify all the tables/figures/sections in your manuscript in which this is the case)
- Why you are unable to remove each small group size
- Description of all selection criteria, i.e. characteristics that participants need to have in order to be part of the group
- The result itself: is this something Lifelines participants are likely to know about themselves or not?