Title: | 'Pubmed' Toolkit |
---|---|
Description: | Provides various functions for retrieving and interpreting information from 'Pubmed' via the API, <https://www.ncbi.nlm.nih.gov/home/develop/api/>. |
Authors: | Benjamin Gregory Carlisle [aut, cre] |
Maintainer: | Benjamin Gregory Carlisle <[email protected]> |
License: | AGPL (>= 3) |
Version: | 1.0.4 |
Built: | 2025-02-12 03:25:00 UTC |
Source: | https://github.com/bgcarlisle/pubmedtk |
Downloads metadata from Pubmed API for a single provided PMID and exports
get_metadata_from_one_pmid(pmid, api_key)
pmid |
A single PMID, e.g. "29559429" |
api_key |
A valid Pubmed API key |
A named list with 8 elements:
`$pubmed_dl_success`, which is TRUE if a corresponding Pubmed record was found and its metadata downloaded, and FALSE otherwise.
`$doi`, a character string containing the DOI for the publication with the PMID in question.
`$languages`, a list of languages corresponding to the publication with the PMID in question.
`$pubtypes`, a list of publication types corresponding to the publication with the PMID in question.
`$pubdate`, the listed publication date.
`$epubdate`, the listed e-publication date.
`$authors`, a list of authors of the publication with the PMID in question.
`$abstract`, a character string containing the abstract for the publication with the PMID in question.
## Not run:
## Read in API key
ak <- readLines("api_key.txt")

## Download Pubmed metadata
mdata <- get_metadata_from_one_pmid("29559429", ak)

## Extract first author
mdata$authors[1]
## End(Not run)
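Because `$pubmed_dl_success` can be FALSE, it may be worth checking it before reading the other elements. The sketch below does not call the Pubmed API: `mock_result` is a hand-built list that only imitates the documented structure (all values are placeholders, and the exact types of the real elements are an assumption), and `first_author_or_na()` is a hypothetical helper, not part of the package.

```r
## Hypothetical stand-in for the return value of
## get_metadata_from_one_pmid(); the values are placeholders,
## not real Pubmed output
mock_result <- list(
  pubmed_dl_success = TRUE,
  doi = "10.0000/placeholder",
  languages = list("eng"),
  pubtypes = list("Journal Article"),
  pubdate = "2018",
  epubdate = "2018",
  authors = list("Author A", "Author B"),
  abstract = "Placeholder abstract."
)

## Hypothetical helper: return the first author, or NA if the
## download failed or no authors were listed
first_author_or_na <- function(res) {
  if (!isTRUE(res$pubmed_dl_success) || length(res$authors) == 0) {
    return(NA_character_)
  }
  as.character(res$authors[[1]])
}

first_author_or_na(mock_result)
first_author_or_na(list(pubmed_dl_success = FALSE))
```

Using `isTRUE()` means a missing or NA success flag is treated the same as a failed download.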
Downloads metadata from Pubmed API for a column of PMID's in a data frame
get_metadata_from_pmids(df, column, api_key, quiet = FALSE)
df |
A dataframe containing a column of PMID's |
column |
The name of the column containing PMID's |
api_key |
A valid Pubmed API key |
quiet |
A boolean TRUE or FALSE. If TRUE, no progress messages will be printed during download. FALSE by default, in which case a progress message is printed for every record downloaded. |
A data frame containing the original columns as well as eight additional columns:
The `pubmed_dl_success` column is TRUE if metadata were successfully downloaded from Pubmed; FALSE if an error occurred during downloading (e.g. due to a number that is well-formed but does not correspond to a true PMID); NA if the supplied PMID is not well-formed (e.g. NA or non-numeric).
The `doi` column returns a DOI that corresponds to the PMID supplied if one is found, NA otherwise.
The `languages` column contains a JSON-encoded list of languages for the article in question.
The `pubtypes` column contains a JSON-encoded list of publication types for the article in question.
The `pubdate` column contains a character string with the publication date.
The `epubdate` column contains a character string with the e-publication date.
The `authors` column contains a JSON-encoded list of authors for the article in question.
The `abstract` column contains a character string with the abstract for the article in question.
## Not run:
## Read in API key
ak <- readLines("api_key.txt")

## Example publications and their corresponding PMID's (some valid
## and some not)
pubs <- tibble::tribble(
  ~pmid,
  "29559429",
  "28837722",
  NA,
  "borp",
  "98472657638729"
)

## Download Pubmed metadata
pm_meta <- get_metadata_from_pmids(pubs, "pmid", ak)

## Extract DOI's for those that were successfully downloaded
pm_meta %>%
  dplyr::filter(pubmed_dl_success) %>%
  dplyr::select(pmid, doi)

## A tibble: 2 × 2
##   pmid     doi
##   <chr>    <chr>
## 1 29559429 10.1136/bmj.k959
## 2 28837722 10.1001/jama.2017.11502
## End(Not run)
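Since `pubmed_dl_success` takes three values (TRUE, FALSE and NA), it can help to tally them separately before working with the other columns. The following is a minimal base R sketch on a hand-built data frame, `mock_meta`, which stands in for real output; its PMIDs and outcomes are placeholders chosen only to show each case.

```r
## Hypothetical stand-in for the output of get_metadata_from_pmids();
## only the columns needed for the tally are included
mock_meta <- data.frame(
  pmid = c("11111111", "22222222", NA, "borp", "99999999999999"),
  pubmed_dl_success = c(TRUE, TRUE, NA, NA, FALSE),
  stringsAsFactors = FALSE
)

## Count each outcome, keeping NA as its own category
outcome_counts <- table(mock_meta$pubmed_dl_success, useNA = "ifany")
outcome_counts

## Rows where a download was attempted but failed
## (which() drops the NA entries)
failed <- mock_meta[which(!mock_meta$pubmed_dl_success), ]

## Rows skipped because the PMID was not well-formed
skipped <- mock_meta[is.na(mock_meta$pubmed_dl_success), ]
```

Separating `failed` from `skipped` distinguishes PMIDs that Pubmed did not recognise from values that were never sent to Pubmed at all.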
Returns a list of PMID's for a provided Pubmed search
get_pmids_from_one_search(query, api_key)
query |
A Pubmed search query |
api_key |
A valid Pubmed API key |
A named list with 3 elements:
`$pubmed_search_success`, which is TRUE if the provided query was searched successfully on Pubmed and FALSE otherwise.
`$n_results`, the number of results for the search as reported by Pubmed.
`$pmids`, a list of PMID's corresponding to the Pubmed search results for the query provided.
## Not run:
## Read in API key
ak <- readLines("api_key.txt")

## Download PMID's for search query
results <- get_pmids_from_one_search("Carlisle B[Author]", ak)

## Extract first result
results$pmids[1]
## End(Not run)
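One possible sanity check on a search result is to compare `$n_results` with the number of PMID's actually returned, in case the two differ. The sketch below uses a hand-built list, `mock_search`, in place of real output; `search_is_complete()` is a hypothetical helper, and treating `$n_results` as something `as.numeric()` can convert is an assumption.

```r
## Hypothetical stand-in for the return value of
## get_pmids_from_one_search(); the values are placeholders
mock_search <- list(
  pubmed_search_success = TRUE,
  n_results = 5,
  pmids = list("11111111", "22222222", "33333333")
)

## Hypothetical helper: TRUE only if the search succeeded and the
## number of PMID's returned matches the reported result count
search_is_complete <- function(res) {
  isTRUE(res$pubmed_search_success) &&
    length(res$pmids) == as.numeric(res$n_results)
}

search_is_complete(mock_search)
```

Here the mock result reports five results but carries three PMID's, so the helper flags it as incomplete.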
Downloads PMID results for a column of Pubmed search queries in a data frame
get_pmids_from_searches(df, column, api_key, quiet = FALSE)
df |
A dataframe containing a column of Pubmed search queries. This data frame cannot have columns with the following names: |
column |
The name of the column containing Pubmed search queries |
api_key |
A valid Pubmed API key |
quiet |
A boolean TRUE or FALSE. If TRUE, no progress messages will be printed during download. FALSE by default, in which case a progress message is printed for every search downloaded. |
A data frame containing the original columns as well as three additional columns:
The `pubmed_search_success` column is TRUE if the search results were successfully obtained from Pubmed; FALSE if an error occurred in the search (e.g. due to a search query that is not well-formed).
The `n_results` column contains the number of search results for the query provided.
The `pmids` column returns a JSON-encoded list of PMID's for the search query provided.
## Not run:
## Read in API key
ak <- readLines("api_key.txt")

## Example Pubmed searches, some valid, some not, some with more
## than 10k results
searches <- tibble::tribble(
  ~terms,
  "Carlisle B[Author]",
  "NCT00267865",
  "(Clinical Trial[Publication Type])",
  ""
)

## Download search results
results <- get_pmids_from_searches(searches, "terms", ak)
## End(Not run)
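Because the `pmids` column is JSON-encoded, it needs decoding before individual PMID's can be used. The following is a minimal sketch using `jsonlite::fromJSON()` on a hand-built data frame, `mock_results`. It assumes that each cell holds a JSON array of strings and that failed searches carry NA in `pmids`; neither detail is confirmed here, and all values are placeholders.

```r
library(jsonlite)

## Hypothetical stand-in for the output of get_pmids_from_searches();
## the JSON strings are assumed to be arrays of PMID strings
mock_results <- data.frame(
  terms = c("query one", "query two", ""),
  pubmed_search_success = c(TRUE, TRUE, FALSE),
  n_results = c(2, 1, NA),
  pmids = c('["11111111","22222222"]', '["33333333"]', NA),
  stringsAsFactors = FALSE
)

## Keep only successful searches, then decode each JSON string
## into a character vector
ok <- mock_results[which(mock_results$pubmed_search_success), ]
decoded <- lapply(ok$pmids, jsonlite::fromJSON)

## Build a long table with one row per (query, PMID) pair
long <- data.frame(
  terms = rep(ok$terms, lengths(decoded)),
  pmid = unlist(decoded),
  stringsAsFactors = FALSE
)
long
```

A long table of this shape could then be passed to other functions that expect a column of PMID's.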
Checks a column of PMID's for whether or not they would appear in a Pubmed search result.
intersection_check(df, column, query, api_key, batch_size = 1000, quiet = FALSE)
df |
A dataframe containing a column of PMID's |
column |
The name of the column containing PMID's |
query |
A character string containing a valid Pubmed search query |
api_key |
A valid Pubmed API key |
batch_size |
An integer greater than 0 and less than 10000 |
quiet |
A boolean TRUE or FALSE. If TRUE, no progress messages will be printed during download. FALSE by default. |
A data frame containing the original columns, as well as two additional ones: `pm_checked` and `found_in_pm_query`.
The new `pm_checked` column is TRUE if Pubmed was successfully queried and NA if Pubmed was not checked for that PMID (this may occur in cases where the PMID to be checked is not well-formed).
The new `found_in_pm_query` column is TRUE if the PMID in question would appear in a search of Pubmed defined by the query provided; FALSE if it would not appear in such a search; and NA if the PMID in question was not checked (this may occur in cases where the PMID is not well-formed).
## Not run:
## Read in API key
ak <- readLines("api_key.txt")

## Example publications and their corresponding PMID's (some valid
## and some not)
pubs <- tibble::tribble(
  ~pmid,
  "29559429",
  "28837722",
  "28961465",
  "32278621",
  "one hundred of them",
  "28837722",
  "28961465"
)

## Check which ones were authored by Carlisle:
intersection_check(pubs, "pmid", "Carlisle[Author]", ak)
## End(Not run)
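Since `found_in_pm_query` may contain NA, subsetting on it directly in base R can produce unexpected all-NA rows. The sketch below shows one NA-safe way to pull out the matches, using a hand-built data frame, `mock_checked`, in place of real output; its PMIDs and outcomes are placeholders.

```r
## Hypothetical stand-in for the output of intersection_check()
mock_checked <- data.frame(
  pmid = c("11111111", "22222222", "one hundred of them"),
  pm_checked = c(TRUE, TRUE, NA),
  found_in_pm_query = c(TRUE, FALSE, NA),
  stringsAsFactors = FALSE
)

## Subsetting directly on a logical column containing NA returns
## an extra row filled with NA for each NA entry
with_na_rows <- mock_checked[mock_checked$found_in_pm_query, ]

## which() drops NA, so only confirmed matches remain
matches <- mock_checked[which(mock_checked$found_in_pm_query), ]

## PMIDs that were never checked against Pubmed
unchecked <- mock_checked[is.na(mock_checked$pm_checked), ]
```

`dplyr::filter()` drops rows where the condition is NA, so it would be an alternative to `which()` here.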