Cell data guidelines
Introduction
The following guidelines detail the form fields found on the cell upload page and cell editing pages. Like the upload form, related sections are grouped near each other in this guide, and are in the same order as on the upload form. Under each section is a description of the input field, an example of what kind of information we want for that field, and standard conventions that should be followed for that piece of information.
Following these guidelines is required when uploading cell data to our database. Doing so keeps our data consistent and accurate.
Cell/subset information
The cell/subset information section describes basic information regarding the cell data being added to the database.
Cell/subset name
Required - The identifier or name best used to identify the cell being added to the database.
Example: Hematopoietic stem cells
Conventions:
- Should use the most accepted name used in the literature.
- Should not contain abbreviations or acronyms. These should be provided in the Alternative name and Alias fields.
- If the name is hotly debated or contested, provide the name referenced earliest in the literature and provide the other accepted name in the Alternative name field. Then list all other names as Aliases.
- Should be plural.
- Can contain most alphanumeric and Greek characters.
- Should be less than 40 characters.
Alternative name
Optional - Alternative name, abbreviation, or acronym used to describe the cell type or subset in addition to the primary name.
Example: HSC
Conventions:
- Should be used for an abbreviation or acronym that supplements the primary name. If the cell type or subset has another widely accepted full name, provide that name here instead and list abbreviations as Aliases.
- Should list another widely accepted name if the primary name is hotly debated or contested. Provide abbreviations as Aliases in that case.
- Should be singular if an abbreviation or acronym and plural if an alternative full name.
- Should not contain parentheses, as these are added automatically.
- Can contain most alphanumeric and Greek characters.
- Should be less than 40 characters.
Other aliases
Optional - Other names, abbreviations, or acronyms that describe the cell type or subset.
Repeatable - This field is repeatable and numerous entries can be added.
Examples: HSPC, Hematopoietic progenitor cells
Conventions:
- Should contain other widely accepted or often used names, abbreviations, or acronyms.
- Should include mouse and human aliases where possible, if they differ.
- Should be singular if an abbreviation or acronym and plural if an alternative full name.
- Should not contain parentheses, as these are added automatically.
- Can contain most alphanumeric and Greek characters.
- Should be less than 40 characters.
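The length and parentheses conventions for the Alternative name and Other aliases fields can be checked before submission. The following is a minimal sketch only: the helper name `check_alias` and its messages are hypothetical and are not part of the upload form, and the character-set rule is left out because the guide does not define it precisely.

```python
from typing import List


def check_alias(text: str, max_length: int = 40) -> List[str]:
    """Return convention problems for an alias-style field (hypothetical helper)."""
    problems = []
    value = text.strip()
    if not value:
        problems.append("value is empty")
    # The guide asks for fewer than 40 characters.
    if len(value) >= max_length:
        problems.append(f"should be less than {max_length} characters")
    # Parentheses are added automatically, so they should not be typed in.
    if "(" in value or ")" in value:
        problems.append("should not contain parentheses")
    return problems
```

An empty list means none of the checked conventions were violated; it does not guarantee the form will accept the value.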
Lineage information
The lineage information section is used to classify the cell type or subset being added to the database. Lineages are curated by staff. If a lineage is missing, please contact staff to have it added.
Lineage name
Required - The lineage that best describes the cell type or subset being added to the database. The most specific lineage should be used.
Note: the lineage needs to be added to the database by staff prior to adding the new cell. Please contact staff to have a new lineage added.
Autocomplete - This field will autocomplete lineage entries from the database.
Example: definitive hematopoietic
Conventions:
- Use the lineage most accepted in the literature, and make it as specific as possible, e.g. "lymphocyte" rather than "hematopoietic" for "T cells". If the lineage is not clearly defined for a cell type, use your best judgment and prefer a more general lineage.
Marker information
The marker information section describes the cell markers that identify the cell type or subset being added to the database. The marker upload form can be used to add marker information to the database if it does not already exist.
Repeatable - This field is repeatable and numerous entries can be added.
Marker name
Required - The marker name that best describes the marker being added to the cell type or subset.
Note: the marker needs to be added to the database prior to adding the new cell. Please use the marker upload form to add the marker.
Autocomplete - This field will autocomplete marker entries from the database.
Example: KIT (CD117)
Conventions:
- Should use widely accepted or well characterized markers found in the literature. Take care to ensure you are using the correct variant for markers that have multiple isoforms.
- Negative markers (see below) should be used sparingly and only when they would aid in cell identification. The focus should be kept on markers with positive expression.
- Should not add duplicate markers to an entry. Ensure you do not add a marker twice.
- If "lineage dump" is selected, list the markers that make up the dump in the Marker analysis strategy field.
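The convention against duplicate markers can be checked on a draft marker list before filling in the form. This is an illustrative sketch: the function name is hypothetical, and treating names that differ only in case or surrounding spaces as duplicates is an assumption, not a documented rule of the database.

```python
from collections import Counter
from typing import List


def find_duplicate_markers(marker_names: List[str]) -> List[str]:
    """Return each marker name that appears more than once (first spelling seen)."""
    # Assumption: comparison ignores case and leading/trailing whitespace.
    keys = [name.strip().casefold() for name in marker_names]
    counts = Counter(keys)
    reported = set()
    duplicates = []
    for name, key in zip(marker_names, keys):
        if counts[key] > 1 and key not in reported:
            reported.add(key)
            duplicates.append(name.strip())
    return duplicates
```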
Marker expression
Required - Expression level that best describes the marker being added to the cell type or subset.
Example: ++
Conventions:
- Should be the widely accepted or well characterized expression level for that cell type. If expression is unclear, debated, or controversial, select "N/A". If expression is fluid or has a range, select an intermediate level such as "+/++". See below for more information:
- "-" indicates the marker is not expressed and absent. Use in circumstances where lack of expression can be used to differentiate the cell type or subset from others cleanly.
- "+/-" indicates the marker has variable expression and may or may not be present, but can still identify a cell type or subset in combination with other markers.
- "+" indicates the marker is expressed. Typically used to indicate "normal" levels of expression when compared to similar cells. Use in circumstances where the marker can be used to differentiate the cell type or subset from others cleanly.
- "+/++" indicates the marker has normal to medium levels of expression compared to similar cell types or can range between normal and medium levels in certain contexts.
- "++" indicates the marker has medium levels of expression compared to similar cell types and is greater than normal expression (+).
- "++/+++" indicates the marker has medium to high levels of expression compared to similar cell types or can range between medium and high levels in certain contexts.
- "+++" indicates the marker has high levels of expression compared to similar cell types and is greater than medium expression (++) and well above normal expression (+).
- "N/A" indicates the marker expression is unknown but can likely be used for cell identification based on other evidence. Can also be used to indicate that expression levels are debated, changing, or controversial.
- Markers that have highly debated or controversial expression should be labeled as "N/A". If this is the case, please document the controversial information in the Marker Notes section.
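The expression levels above form an ordered scale, with "N/A" standing outside it. A short sketch of that scale follows; the names `EXPRESSION_LEVELS` and `expression_rank` are hypothetical, and placing "+/-" between "-" and "+" is an assumption based on the descriptions above rather than a stated rule.

```python
from typing import Optional

# Levels from absent to high, in the order they are described in this guide.
EXPRESSION_LEVELS = ["-", "+/-", "+", "+/++", "++", "++/+++", "+++"]
NOT_APPLICABLE = "N/A"


def expression_rank(level: str) -> Optional[int]:
    """Return the position of a level on the scale, or None for "N/A"."""
    if level == NOT_APPLICABLE:
        # Unknown or debated expression has no position on the scale.
        return None
    if level not in EXPRESSION_LEVELS:
        raise ValueError(f"unknown expression level: {level!r}")
    return EXPRESSION_LEVELS.index(level)
```

Comparing two ranks shows which level is higher, for example when choosing between "+" and "++" for a marker; a rank of None signals that no comparison is meaningful.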
Marker citations
Optional, but highly recommended - Evidence to support marker expression for the cell type or subset being added.
Repeatable - This field is repeatable and numerous entries can be added.
Example: https://pubmed.ncbi.nlm.nih.gov/########/
Conventions:
- Should provide sources that are relevant for the marker being added. These sources should clearly document expression for the cell type or subset being added.
- Cite original literature for unique or newly described markers whenever possible.
- Reviews are acceptable for widely accepted markers.
- Should prefer quality over quantity for citations. Original literature describing the marker should be enough. In cases of highly debated or contested markers, cite both supporting and opposing literature.
- Should prefer PubMed links over links to specific journals or resources.
- Should be a valid URL.
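The last two conventions (a valid URL, preferably a PubMed link) can be checked with the Python standard library. This is a sketch under assumptions: the helper name is hypothetical, "valid" is interpreted as an http or https URL with a host, and the PubMed host is taken from the example above.

```python
from typing import List
from urllib.parse import urlparse

PUBMED_HOST = "pubmed.ncbi.nlm.nih.gov"


def check_citation_url(url: str) -> List[str]:
    """Return problems found in a citation URL (hypothetical helper)."""
    try:
        parsed = urlparse(url.strip())
    except ValueError:
        # urlparse rejects a few malformed inputs, such as a broken IPv6 host.
        return ["not a valid http(s) URL"]
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return ["not a valid http(s) URL"]
    if parsed.hostname != PUBMED_HOST:
        # Not an error: the guide only states a preference for PubMed.
        return ["PubMed links are preferred"]
    return []
```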
Marker recommendation
Optional - Specify whether the marker is "required" or highly "suggested" for identifying/differentiating the cell type or subset from other cells.
Example: Suggested
Conventions:
- "None" is the default recommendation and should be used for most general markers.
- "Required" indicates the marker is absolutely required for the identification of the cell type or subset being added. Required markers could be thought of as a small set of core markers that delineate a cell type from others. e.g. CD19 would be a good "Required" marker for B cells.
- "Suggested" indicates the marker is important for the identification of the cell type or subset being added, but is not absolutely required if other markers are used. e.g. In mice, B220 would be a good "Suggested" marker for B cells in combination with CD19.
Antibody clone
Optional - Specify an antibody clone if it is required or suggested for identifying the cell type or subset.
Example: ABC123
Conventions:
- Listing the clone is only recommended if it matters for the identification of a specific cell/subset. Do not list clones for each marker if they are not important.
- Expand on the rationale for using a specific clone in the Marker analysis strategy field.
- Can contain most alphanumeric characters.
- Should be less than 20 characters.
Marker species
Optional - Specify whether the marker is expressed only in mice or only in humans. Often used along with Recommendation to denote a marker that is required for cell type or subset identification in a particular species.
Example: Mouse
Conventions:
- "None" is the default and is ideal for most markers.
- "Mouse" indicates the cell marker is expressed only in mice.
- "Human" indicates the cell marker is expressed only in humans.
Marker localization
Optional - Specify the localization of the antigen an antibody is specific for.
Example: Mitochondrial
Conventions:
- "None" is the default and is ideal for most surface markers.
- "Surface" indicates the cell marker is expressed on the surface of the cell. Only specify if there may be confusion regarding the localization of a particular marker.
- "Nuclear" indicates the cell marker is expressed in the nucleus. This should be selected when adding any nuclear proteins, especially transcription factors.
- "Intracellular" indicates the cell marker has intracellular expression. Do not use this for nuclear proteins. Only specify if there may be confusion regarding the localization of a particular marker. e.g. an antibody recognizes an intracellular domain of a surface protein.
- "Mitochondrial" indicates the cell marker is expressed on or in the mitochondria. This should be selected when adding mitochondrial proteins or if there may be confusion regarding the localization of a particular marker. e.g. BCL2 localization to the mitochondrial membrane.
- "Secreted" indicates the cell marker is produced and secreted from the cell. e.g. cytokine production after cell stimulation.
Marker analysis strategy
Optional - Long-form text field that provides information regarding identification strategies for a specific marker.
Example: Stain this marker at 37 °C for 15 minutes.
Conventions:
- Should accurately describe how to use or analyze difficult markers.
- If a specific antibody clone is required to identify a cell population, more details can be provided here. e.g. Analysis of murine pro-B cell populations with anti-CD24 should use the 30-F1 clone.
- If a lineage dump was selected for a marker, specific markers that compose the lineage dump need to be listed in detail here. e.g. The following lineage dump is required to enrich HSCs: CD3, CD4, CD8, CD19, B220, GR-1, CD11b, CD11c, NK1.1.
- Can be used to elaborate on the usage of the provided marker.
- Can contain most alphanumeric and Greek characters.
- Should be less than 100 characters.
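The lineage dump and length conventions for this field can be combined into one check. The sketch below is illustrative: the function name is hypothetical, and it assumes the marker is selected under the literal name "lineage dump", as quoted in the Marker name conventions.

```python
from typing import List


def check_marker_strategy(marker_name: str, strategy: str,
                          max_length: int = 100) -> List[str]:
    """Return problems with a Marker analysis strategy entry (hypothetical helper)."""
    problems = []
    text = strategy.strip()
    # Assumption: a lineage dump is identified by this exact marker name.
    if marker_name.strip().casefold() == "lineage dump" and not text:
        problems.append("list the markers that make up the lineage dump")
    if len(text) >= max_length:
        problems.append(f"should be less than {max_length} characters")
    return problems
```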
Marker notes
Optional - Long-form text field that provides other information regarding a specific marker.
Example: Identification of X cells using this marker often works best when including marker Y.
Conventions:
- Should provide potentially useful information about a particular marker. This field should be used minimally.
- Can contain most alphanumeric and Greek characters.
- Should be less than 100 characters.
Cell/subset description
The cell/subset description section provides relevant descriptive information regarding the cell type or subset being added to the database.
Description
Required - Long-form text field that provides descriptive information regarding the cell type or subset.
Example: Hematopoietic stem cells (HSCs) are a type of stem cell that gives rise to other blood cells. Hematopoiesis is the process by which HSCs give rise to mature blood cells. HSCs give rise to multipotent, oligopotent, and unipotent progenitors that eventually give rise to adult blood cells in more specified hematopoietic lineages. HSCs are capable of self-renewal. Transplantation of HSCs is used in the treatment of cancer and immune system disorders.
Conventions:
- Used to provide useful or factual information regarding the cell type/subset, such as the frequency of the cell type found in a specific tissue.
- Can utilize acronyms in this section, as long as they are used consistently.
- Can utilize protein/gene symbols, as long as they follow the nomenclature and rules outlined in the marker information guide.
Other notes
Optional - Long-form text field for informational notes regarding the cell type or subset being added.
Example: In humans, hematopoiesis regulates HSC production of more than 500 billion blood cells each day.
Conventions:
- Should provide potentially useful information or facts, such as a description of the overall function of a cell lineage.
- Should be used to describe why certain markers may be hotly debated or contested.
External resources
The external resources section provides a mechanism to link a cell type or subset to other relevant databases or resources.
Optional - Fields in this section are optional, but recommended.
Wikipedia link
Optional - Used to link the cell type or subset to a Wikipedia page.
Example: https://en.wikipedia.org/wiki/Hematopoietic_stem_cell
Conventions:
- Should only include a valid URL for a relevant Wikipedia page.
- Should use https links whenever possible.
External databases
Optional - Used to link the cell type or subset to other relevant external databases. Details described below.
Repeatable - This field is repeatable and numerous entries can be added.
Conventions:
- Should include separate entries for the human and mouse pages of an external database, if that database has separate pages for them. Label them as described in the Database ID section below.
Database name
Required if adding databases - Name of external database being linked.
Example: HSC-Explorer
Conventions:
- Should prefer using an abbreviation or acronym of the database resource, if possible.
- Can contain most alphanumeric and Greek characters.
- Should be 25 characters or less.
Database ID
Optional - External ID used to identify cell type or subset. This is only provided here for convenience and is not required.
Example: A12345
Conventions:
- Should match the format provided on the external database. e.g. if the entry can be found at http://some-database.com/A12345, then "A12345" should be provided, unless the data page provides a different unique entry ID.
- If distinguishing between human and mouse database links, append either "(human)" or "(mouse)" to the end of the ID. Be sure to separate the text from the ID with a space.
- Can contain both letters and numbers.
- Should be 40 characters or less.
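The species-suffix convention for Database IDs is simple to apply consistently in code. This is a minimal sketch: the function name is hypothetical, and counting the suffix toward the 40-character limit is an assumption.

```python
from typing import Optional


def format_database_id(external_id: str, species: Optional[str] = None) -> str:
    """Build a Database ID, appending "(human)" or "(mouse)" after a single space."""
    label = external_id.strip()
    if species is not None:
        if species not in ("human", "mouse"):
            raise ValueError(f"unsupported species: {species!r}")
        label = f"{label} ({species})"
    # Assumption: the length limit applies to the ID including its suffix.
    if len(label) > 40:
        raise ValueError("Database ID should be 40 characters or less")
    return label
```

For the example ID above, passing "mouse" as the species yields "A12345 (mouse)".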
Database link
Required if adding databases - URL pointing to the external database being linked.
Example: http://mips.helmholtz-muenchen.de/HSC/
Conventions:
- Should include the full URL used to browse to the relevant database entry.
- Should be a valid URL.
- Should use https links whenever possible.
External links
Optional - Used to link the cell type or subset to other relevant external websites. Details described below.
Repeatable - This field is repeatable and numerous entries can be added.
Link name
Required if adding links - Name of the external resource being linked.
Example: HSC-Explorer
Conventions:
- Should prefer using an abbreviation or acronym of the resource, if possible.
- Can contain most alphanumeric and Greek characters.
- Should be 25 characters or less.
Link URL
Required if adding links - URL pointing to the external resource being linked.
Example: https://www.cancer.gov/publications/dictionaries/cancer-terms/def/hematopoietic-stem-cell
Conventions:
- Should include the full URL used to browse to the relevant resource.
- Should be a valid URL.
- Should use https links whenever possible.