| Title: | Explore 'Wikidata' Through Tidy Data Frames |
|---|---|
| Description: | Query 'Wikidata' API <https://www.wikidata.org/wiki/Wikidata:Main_Page> with ease, get tidy data frames in response, and cache data in a local database. |
| Authors: | Giorgio Comai [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-0515-9542>), EDJNet [fnd] |
| Maintainer: | Giorgio Comai <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.6.2 |
| Built: | 2026-06-05 06:19:10 UTC |
| Source: | https://github.com/edjnet/tidywikidatar |
Mostly used internally in functions, exported for reference.
tw_check_cache(cache = NULL)
cache |
Defaults to |
Either TRUE or FALSE, depending on current cache settings.
if (interactive()) {
  tw_check_cache()
}
Checks if the cache folder exists; if it does not, throws an informative error.
tw_check_cache_folder()
If the cache folder exists, returns TRUE. Otherwise throws an
error.
# If cache folder does not exist, it throws an error
tryCatch(tw_check_cache_folder(),
  error = function(e) {
    return(e)
  }
)

# Create cache folder
tw_set_cache_folder(path = fs::path(
  tempdir(),
  "tw_cache_folder"
))
tw_create_cache_folder(ask = FALSE)

tw_check_cache_folder()
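The check-then-error pattern described above can be sketched in base R. This is a minimal illustration only: `check_folder_sketch()` and the folder name are hypothetical and are not part of tidywikidatar, whose actual implementation may differ.

```r
# Hypothetical helper (not part of tidywikidatar): returns TRUE if the
# folder exists, otherwise signals an error.
check_folder_sketch <- function(path) {
  if (!dir.exists(path)) {
    stop("Cache folder does not exist: ", path, call. = FALSE)
  }
  TRUE
}

# Hypothetical folder name, placed in the session's temporary directory.
sketch_path <- file.path(tempdir(), "tw_cache_folder_sketch")
unlink(sketch_path, recursive = TRUE) # start from a clean state

# Before the folder is created, the check signals an error;
# tryCatch() captures it and returns the condition object.
before <- tryCatch(check_folder_sketch(sketch_path), error = function(e) e)

# After the folder is created, the same check returns TRUE.
dir.create(sketch_path, showWarnings = FALSE)
after <- check_folder_sketch(sketch_path)
```

Wrapping the call in `tryCatch()`, as in the package example, lets a script decide whether to create the folder instead of stopping.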
Tested only with SQLite and MySQL. May work with other drivers. Used to check
whether a given cache table is indexed (tables created with any version of
tidywikidatar before 0.6 are probably not indexed and are less
efficient).
tw_check_cache_index(
  table_name = NULL,
  type = "item",
  show_details = FALSE,
  language = tidywikidatar::tw_get_language(),
  response_language = tidywikidatar::tw_get_language(),
  cache = NULL,
  cache_connection = NULL,
  disconnect_db = TRUE
)
table_name |
Name of the table in the database. If given, it takes precedence over other parameters. |
type |
Defaults to "item". Type of cache file to output. Values
typically used by |
show_details |
Logical, defaults to |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If show_details is set to FALSE, returns a logical vector of
length one (TRUE if the table is indexed, FALSE if it is not). If
show_details is set to TRUE, returns a data frame with more details
about the index.
if (interactive()) {
  tw_enable_cache()
  tw_set_cache_folder(path = fs::path(
    fs::path_home_r(),
    "R",
    "tw_data"
  ))

  tw_set_language(language = "en")

  tw_check_cache_index()
}
Check if given items are present in cache
tw_check_cached_items(
  id,
  language = tidywikidatar::tw_get_language(),
  cache_connection = NULL,
  disconnect_db = TRUE
)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
language |
Defaults to language set with |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A character vector with IDs of items present in cache. If no item is
found in cache, returns NULL.
if (interactive()) {
  tw_set_cache_folder(path = tempdir())
  tw_enable_cache()
  tw_create_cache_folder(ask = FALSE)

  # add three items to local cache
  invisible(tw_get(id = "Q180099", language = "en"))
  invisible(tw_get(id = "Q228822", language = "en"))
  invisible(tw_get(id = "Q184992", language = "en"))

  # check if these other items are in cache
  items_in_cache <- tw_check_cached_items(
    id = c(
      "Q180099",
      "Q228822",
      "Q76857"
    ),
    language = "en"
  )
  # it should return only the two items from the current list of ids,
  # but not the other item already in cache
  items_in_cache
}
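The return value described above (only the requested IDs that are present in cache, or NULL if none are) can be sketched in base R without a database. `ids_in_cache_sketch()` and the `cached_ids` vector are hypothetical stand-ins for the cache lookup; the package itself queries the cache database.

```r
# Hypothetical helper (not part of tidywikidatar): keeps only the requested
# ids that appear in a vector of cached ids; returns NULL if none do.
ids_in_cache_sketch <- function(id, cached_ids) {
  found <- id[id %in% cached_ids]
  if (length(found) == 0) {
    return(NULL)
  }
  found
}

# Stand-in for the cache: the three ids added in the example above.
cached_ids <- c("Q180099", "Q228822", "Q184992")

# Two of the requested ids are cached, one is not.
in_cache <- ids_in_cache_sketch(
  id = c("Q180099", "Q228822", "Q76857"),
  cached_ids = cached_ids
)

# None of the requested ids is cached.
none_in_cache <- ids_in_cache_sketch(id = "Q76857", cached_ids = cached_ids)
```

Note that "Q184992" is cached but was not requested, so it does not appear in the result.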
Mostly used internally by other functions.
tw_check_pid(property, logical_vector = FALSE, non_pid_as_NA = FALSE)
property |
A character vector of one or more Wikidata property identifiers. |
logical_vector |
Logical, defaults to |
non_pid_as_NA |
Logical, defaults to |
A character vector with only strings appearing to be Wikidata property identifiers; possibly shorter than input.
tw_check_pid(property = c("P19", "p20", "Not an property id", "20", NA, "Q5", ""))

tw_check_pid(
  property = c("P19", "p20", "Not an property id", "20", NA, "Q5", ""),
  logical_vector = TRUE
)

tw_check_pid(
  property = c("P19", "p20", "Not an property id", "20", NA, "Q5", ""),
  non_pid_as_NA = TRUE
)
Mostly used internally by other functions.
tw_check_qid(id, logical_vector = FALSE, non_id_as_NA = FALSE)
id |
A character vector of one or more Wikidata id. |
logical_vector |
Logical, defaults to |
non_id_as_NA |
Logical, defaults to |
A character vector with only strings appearing to be Wikidata identifiers; possibly shorter than input.
tw_check_qid(id = c("Q180099", "q228822", "Not an id", "00180099", NA, "Q5"))

tw_check_qid(
  id = c("Q180099", "q228822", "Not an id", "00180099", NA, "Q5"),
  logical_vector = TRUE
)

tw_check_qid(
  id = c("Q180099", "q228822", "Not an id", "00180099", NA, "Q5"),
  non_id_as_NA = TRUE
)
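The three output modes named in the usage above (filtered vector, logical vector, or same-length vector with NA) can be illustrated with a base R sketch. `check_qid_sketch()` is hypothetical and is not the package's implementation; in particular, the regular expression and the choice to accept a lower-case "q" and upper-case it are assumptions of this sketch.

```r
# Hypothetical re-implementation for illustration only; the pattern and the
# case handling are assumptions, not taken from tidywikidatar.
check_qid_sketch <- function(id, logical_vector = FALSE, non_id_as_NA = FALSE) {
  # TRUE where the string is "Q" (or "q") followed only by digits
  is_qid <- !is.na(id) & grepl("^[Qq][0-9]+$", id)

  if (logical_vector) {
    return(is_qid)
  }

  if (non_id_as_NA) {
    # Same length as input, with NA wherever the string is not an id
    out <- rep(NA_character_, length(id))
    out[is_qid] <- toupper(id[is_qid])
    return(out)
  }

  # Default: only the valid ids, possibly shorter than the input
  toupper(id[is_qid])
}

input <- c("Q180099", "q228822", "Not an id", "00180099", NA, "Q5")

only_ids <- check_qid_sketch(input)
as_logical <- check_qid_sketch(input, logical_vector = TRUE)
with_na <- check_qid_sketch(input, non_id_as_NA = TRUE)
```

The `non_id_as_NA` mode is the one to use inside a data frame column, since it preserves the length of the input.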
Mostly used as a convenience function inside other functions to have consistent inputs.
tw_check_search(
  search,
  type = "item",
  language = tidywikidatar::tw_get_language(),
  response_language = tidywikidatar::tw_get_language(),
  limit = 10,
  include_search = FALSE,
  wait = 0,
  cache = NULL,
  overwrite_cache = FALSE,
  cache_connection = NULL,
  disconnect_db = TRUE
)
search |
A string to be searched in Wikidata |
type |
Defaults to "item". Either "item" or "property". |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
include_search |
Logical, defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A data frame with three columns, id, label, and description,
filtered by the above criteria. Four columns if include_search is set to
TRUE.
# The following two lines should give the same result.
## Not run:
tw_check_search("Sylvia Pankhurst")
tw_check_search(tw_search("Sylvia Pankhurst"))
## End(Not run)
Return a connection to be used for caching
tw_connect_to_cache(
  connection = NULL,
  RSQLite = NULL,
  language = tidywikidatar::tw_get_language(),
  cache = NULL
)
connection |
Defaults to |
RSQLite |
Defaults to |
language |
Defaults to language set with |
cache |
Defaults to |
A connection object.
if (interactive()) {
  cache_connection <- pool::dbPool(
    RSQLite::SQLite(), # or e.g. odbc::odbc(),
    Driver = ":memory:", # or e.g. "MariaDB",
    Host = "localhost",
    database = "example_db",
    UID = "example_user",
    PWD = "example_pwd"
  )
  tw_connect_to_cache(cache_connection)

  db_settings <- list(
    driver = "MySQL",
    host = "localhost",
    server = "localhost",
    port = 3306,
    database = "tidywikidatar",
    user = "secret_username",
    pwd = "secret_password"
  )

  tw_connect_to_cache(db_settings)
}
Creates the base cache folder where tidywikidatar caches data.
tw_create_cache_folder(ask = TRUE)
ask |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) {
  tw_create_cache_folder()
}
Disable caching for the current session
tw_disable_cache()
Nothing, used for its side effects.
if (interactive()) {
  tw_disable_cache()
}
Ensure that connection to cache is disconnected consistently
tw_disconnect_from_cache(
  cache = NULL,
  cache_connection = NULL,
  disconnect_db = TRUE,
  language = tidywikidatar::tw_get_language()
)
cache |
Defaults to NULL. If given, it should be given either |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
language |
Defaults to language set with |
Nothing, used for its side effects.
if (interactive()) {
  tw_get(
    id = c("Q180099"),
    language = "en"
  )
  tw_disconnect_from_cache()
}
A zero-row tibble used internally when tw_get_image_metadata() would not return any value.
tw_empty_image_metadata
A data frame with 0 rows and 19 columns
A zero-row tibble used internally when tw_get() would not return any value.
tw_empty_item
A data frame with 0 rows and 3 columns
A zero-row tibble used internally when tw_get_qualifiers() would not return any value.
tw_empty_qualifiers
A data frame with 0 rows and 8 columns
A zero-row tibble used internally when tw_search() would not return any value.
tw_empty_search
A data frame with 0 rows and 4 columns
A zero-row tibble used internally when tw_get_wikipedia_category_members() would not return any value.
tw_empty_wikipedia_category_members
A data frame with 0 rows and 3 columns
A zero-row tibble used internally when tw_get_wikipedia_page_qid() would not return any value.
tw_empty_wikipedia_page
A data frame with 0 rows and 6 columns
A zero-row tibble used internally when tw_get_wikipedia_page_links() would not return any value.
tw_empty_wikipedia_page_links
A data frame with 0 rows and 8 columns
A zero-row tibble used internally when tw_get_wikipedia_page_sections() would not return any value.
tw_empty_wikipedia_page_sections
A data frame with 0 rows and 8 columns
Enable caching for the current session
tw_enable_cache(SQLite = TRUE)
SQLite |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) {
  tw_enable_cache()
}
tw_get_item()
This function is mostly used internally and for testing.
tw_extract_qualifier(
  id,
  p,
  w = NULL,
  retry = 10,
  user_agent = tidywikidatar::tw_get_user_agent()
)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
w |
A list, typically created with |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A data frame (a tibble) with eight columns: id for the input id,
property, qualifier_id, qualifier_property, qualifier_value,
rank, qualifier_value_type, and set (to distinguish sets of data when
a property is present more than once)
# w <- tw_get_item(id = "Q180099")
tw_extract_qualifier(id = "Q180099", p = "P26", w = list(tw_test_items[["Q180099"]]))
This function is mostly used internally and for testing.
tw_extract_single(w, language = tidywikidatar::tw_get_language())
w |
A list, typically retrieved with |
language |
Defaults to language set with |
A data frame (a tibble) with four columns, such as the one created by
tw_get().
# Retrieving from tests, but normally:
# w <- tw_get_item(id = "Q180099")
tidywikidatar:::tw_extract_single(w = list(tw_test_items[["Q180099"]]))
Filter search result and keep only items with matching property and Q identifier
tw_filter(
  search,
  p,
  q,
  language = tidywikidatar::tw_get_language(),
  response_language = tidywikidatar::tw_get_language(),
  limit = 10,
  include_search = FALSE,
  wait = 0,
  cache = NULL,
  overwrite_cache = FALSE,
  cache_connection = NULL,
  disconnect_db = TRUE
)
search |
A data frame generated by |
p |
A character vector of length 1, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
q |
A character vector of length 1, a Wikidata id. Must always start with the capital letter "Q", e.g. "Q5" for "human being". |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
include_search |
Logical, defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A data frame with three columns, id, label, and description,
filtered by the above criteria.
## Not run:
tw_search(search = "Margaret Mead", limit = 3) %>%
  tw_filter(p = "P31", q = "Q5")
## End(Not run)
Same as tw_filter(), but consistently returns data frames with a single
row.
tw_filter_first(
  search,
  p,
  q,
  language = tidywikidatar::tw_get_language(),
  response_language = tidywikidatar::tw_get_language(),
  limit = 10,
  include_search = FALSE,
  wait = 0,
  cache = NULL,
  overwrite_cache = FALSE,
  cache_connection = NULL,
  disconnect_db = TRUE
)
search |
A data frame generated by |
p |
A character vector of length 1, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
q |
A character vector of length 1, a Wikidata id. Must always start with the capital letter "Q", e.g. "Q5" for "human being". |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
include_search |
Logical, defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A data frame with one row and three columns, id, label, and
description, filtered by the above criteria.
## Not run:
tw_search("Margaret Mead") %>%
  tw_filter_first(p = "P31", q = "Q5")
## End(Not run)
A wrapper of tw_filter() that by default keeps only items that are an
"instance of" (P31) "human being" (Q5).
tw_filter_people(
  search,
  language = tidywikidatar::tw_get_language(),
  response_language = tidywikidatar::tw_get_language(),
  limit = 10,
  include_search = FALSE,
  stop_at_first = TRUE,
  wait = 0,
  overwrite_cache = FALSE,
  cache_connection = NULL,
  disconnect_db = TRUE
)
search |
A data frame generated by |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
include_search |
Logical, defaults to |
stop_at_first |
Logical, defaults to TRUE. If TRUE, returns only the first match from the search that satisfies the criteria. |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A data frame with three columns, id, label, and description;
all rows refer to a human being.
## Not run:
tw_search("Ruth Benedict")

tw_search("Ruth Benedict") %>%
  tw_filter_people()
## End(Not run)
Return (most) information from a Wikidata item in a tidy format
tw_get(
  id,
  language = tidywikidatar::tw_get_language(),
  cache = NULL,
  overwrite_cache = FALSE,
  cache_connection = NULL,
  disconnect_db = TRUE,
  wait = 0,
  retry = 10,
  id_l = NULL,
  user_agent = tidywikidatar::tw_get_user_agent()
)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
id_l |
Defaults to |
user_agent |
Defaults to |
A data.frame (a tibble) with three columns (id, property, and
value).
if (interactive()) {
  tw_get(
    id = c("Q180099", "Q228822"),
    language = "en"
  )
}

## using `tw_test_items` in examples in order to show output without calling
## on Wikidata servers

tw_get(
  id = c("Q180099", "Q228822"),
  language = "en",
  id_l = tw_test_items
)
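The retry argument described above caps how many times a failed request is attempted again. A minimal base R sketch of such a loop follows; `retry_sketch()` and `flaky()` are hypothetical, and the fixed `wait` is a simplification, since the package is documented as waiting for the time requested by the API.

```r
# Hypothetical helper (not part of tidywikidatar): calls f() and, on error,
# tries again up to `retry` more times, sleeping `wait` seconds in between.
retry_sketch <- function(f, retry = 10, wait = 0) {
  attempt <- 0
  repeat {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt >= retry) {
      stop(result) # retries exhausted: re-signal the last error
    }
    attempt <- attempt + 1
    Sys.sleep(wait)
  }
}

# Stand-in for an API call that fails twice before succeeding.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) {
    stop("too many requests")
  }
  "ok"
}

res <- retry_sketch(flaky, retry = 5, wait = 0)
```

With `retry = 0` the sketch makes a single attempt and fails immediately on error, which is a reasonable setting for interactive debugging.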
This function does not cache results.
tw_get_all_with_p(
  p,
  fields = c("item", "itemLabel", "itemDescription"),
  language = tidywikidatar::tw_get_language(),
  method = "SPARQL",
  wait = 0.1,
  limit = Inf,
  return_as_tw_search = TRUE,
  user_agent = tidywikidatar::tw_get_user_agent()
)
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
fields |
A character vector of Wikidata fields. Ignored if
|
language |
Defaults to language set with |
method |
Defaults to "SPARQL". The only accepted alternative value is "JSON", which uses the JSON-based API instead. |
wait |
Defaults to 0.1. Used only if method is set to "JSON". |
limit |
Defaults to |
return_as_tw_search |
Logical, defaults to |
user_agent |
Defaults to a combination of |
A data frame with three columns if method is set to "SPARQL", or as
many columns as fields if more are given and return_as_tw_search is set
to FALSE. A single column with the Wikidata identifier if method is set to
"JSON".
if (interactive()) {
  # get all Wikidata items with an ICAO airport code ("P239")
  tw_get_all_with_p(p = "P239", limit = 10)
}
Typically set with tw_set_cache_db().
tw_get_cache_db()
A list with all database parameters as stored in environment variables.
tw_get_cache_db()
Gets location of cache file
tw_get_cache_file(
  extension = "sqlite",
  language = tidywikidatar::tw_get_language()
)
extension |
Defaults to |
language |
Defaults to language set with |
A character vector of length one with location of item cache file.
tw_set_cache_folder(path = tempdir())
sqlite_cache_file_location <- tw_get_cache_file() # outputs location of cache file
Gets name of table inside the database
tw_get_cache_table_name(
  type = "item",
  language = tidywikidatar::tw_get_language(),
  response_language = tidywikidatar::tw_get_language()
)
type |
Defaults to "item". Type of cache file to output. Values
typically used by |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
A character vector of length one with the name of the relevant table in the cache file.
# outputs name of table used in the cache database
tw_get_cache_table_name(type = "item", language = "en")
Retrieve cached item
tw_get_cached_item(
  id,
  language = tidywikidatar::tw_get_language(),
  cache = NULL,
  cache_connection = NULL,
  disconnect_db = TRUE
)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If data are present in cache, returns a data frame with cached data.
tw_set_cache_folder(path = tempdir())
tw_enable_cache()
tw_create_cache_folder(ask = FALSE)

df_from_api <- tw_get(id = "Q180099", language = "en")

df_from_cache <- tw_get_cached_item(
  id = "Q180099",
  language = "en"
)
Retrieve cached qualifier
tw_get_cached_qualifiers(
  id,
  p,
  language = tidywikidatar::tw_get_language(),
  cache = NULL,
  cache_connection = NULL,
  disconnect_db = TRUE
)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If data are present in cache, returns a data frame with cached data.
tw_set_cache_folder(path = tempdir())
tw_enable_cache()
tw_create_cache_folder(ask = FALSE)

df_from_api <- tw_get_qualifiers(id = "Q180099", p = "P26", language = "en")

df_from_cache <- tw_get_cached_qualifiers(
  id = "Q180099",
  p = "P26",
  language = "en"
)

df_from_cache
Retrieve cached search
tw_get_cached_search(
  search,
  type = "item",
  language = tidywikidatar::tw_get_language(),
  response_language = tidywikidatar::tw_get_language(),
  cache = NULL,
  include_search = FALSE,
  cache_connection = NULL,
  disconnect_db = TRUE
)
search |
A string to be searched in Wikidata |
type |
Defaults to "item". Either "item" or "property". |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
cache |
Defaults to |
include_search |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If data are present in cache, returns a data frame with cached data.
## Not run:
tw_set_cache_folder(path = tempdir())
tw_enable_cache()
tw_create_cache_folder(ask = FALSE)

search_from_api <- tw_search("Sylvia Pankhurst")
search_from_api

df_from_cache <- tw_get_cached_search("Sylvia Pankhurst")
df_from_cache
## End(Not run)
Mostly used internally.
tw_get_cached_wikipedia_category_members(
  category,
  type = "page",
  language = tidywikidatar::tw_get_language(),
  cache = NULL,
  cache_connection = NULL,
  disconnect_db = TRUE
)
category |
Title of a Wikipedia category page or final parts of its url. Must include "Category:", or equivalent in other languages. If given, url can be left empty, but language must be provided. |
type |
Defaults to "page", defines which kind of members of a category
are returned. Valid values include "page", "file", and "subcat" (for
sub-category). Corresponds to |
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If data are present in cache, returns a data frame with cached data.
if (interactive()) {
  tw_set_cache_folder(path = tempdir())
  tw_enable_cache()
  tw_create_cache_folder(ask = FALSE)

  df_from_api <- tw_get_wikipedia_category_members(category = "Margaret Mead", language = "en")

  df_from_cache <- tw_get_cached_wikipedia_category_members(
    category = "Margaret Mead",
    language = "en"
  )

  df_from_cache
}
Mostly used internally.
tw_get_cached_wikipedia_page_links(
  title,
  language = tidywikidatar::tw_get_language(),
  cache = NULL,
  cache_connection = NULL,
  disconnect_db = TRUE
)
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Two-letter language code used to define the Wikipedia version
to use. Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If data present in cache, returns a data frame with cached data.
if (interactive()) {
  tw_set_cache_folder(path = tempdir())
  tw_enable_cache()
  tw_create_cache_folder(ask = FALSE)

  # Populate the cache by querying the API for the links of the page
  df_from_api <- tw_get_wikipedia_page_links(title = "Margaret Mead", language = "en")

  df_from_cache <- tw_get_cached_wikipedia_page_links(
    title = "Margaret Mead",
    language = "en"
  )

  df_from_cache
}
Mostly used internally.
tw_get_cached_wikipedia_page_qid(title, language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE)
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If data present in cache, returns a data frame with cached data.
if (interactive()) {
  tw_set_cache_folder(path = tempdir())
  tw_enable_cache()
  tw_create_cache_folder(ask = FALSE)

  df_from_api <- tw_get_wikipedia_page_qid(title = "Margaret Mead", language = "en")

  df_from_cache <- tw_get_cached_wikipedia_page_qid(
    title = "Margaret Mead",
    language = "en"
  )

  df_from_cache
}
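The cached getters documented here follow a common pattern: look in the local cache first, and query the API only when nothing is stored yet. The following is a minimal base R sketch of that pattern, not the package's implementation. It uses an in-memory environment in place of a database, and `cache_env`, `fetch_remote()` and `get_with_cache()` are hypothetical names introduced only for illustration.

```r
# Hypothetical in-memory cache; tidywikidatar itself caches in a database.
cache_env <- new.env()
api_calls <- 0L

# Stand-in for an API call (assumption): counts how often it is actually hit
# and always returns the same illustrative row.
fetch_remote <- function(title) {
  api_calls <<- api_calls + 1L
  data.frame(title = title, qid = "Q180099", stringsAsFactors = FALSE)
}

# Return the cached value if present, otherwise fetch and store it.
get_with_cache <- function(title, overwrite_cache = FALSE) {
  key <- tolower(title)
  if (!overwrite_cache && exists(key, envir = cache_env, inherits = FALSE)) {
    return(get(key, envir = cache_env))
  }
  result <- fetch_remote(title)
  assign(key, result, envir = cache_env)
  result
}

first <- get_with_cache("Margaret Mead")  # fetched, then stored
second <- get_with_cache("Margaret Mead") # served from the cache
```

After both calls the stand-in API has been hit only once; passing `overwrite_cache = TRUE` would force a fresh fetch.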
Mostly used internally.
tw_get_cached_wikipedia_page_sections(title, language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE)
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
If data present in cache, returns a data frame with cached data.
if (interactive()) {
  tw_set_cache_folder(path = tempdir())
  tw_enable_cache()
  tw_create_cache_folder(ask = FALSE)

  # Populate the cache by querying the API for the sections of the page
  df_from_api <- tw_get_wikipedia_page_sections(title = "Margaret Mead", language = "en")

  df_from_cache <- tw_get_cached_wikipedia_page_sections(
    title = "Margaret Mead",
    language = "en"
  )

  df_from_cache
}
Get Wikidata description in given language
tw_get_description(id, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector of Wikidata identifiers, each starting with Q, e.g. "Q254" for Wolfgang Amadeus Mozart. |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A character vector of the same length as the vector of id given, with the Wikidata description in the requested language.
## Not run:
tw_get_description(
  id = c("Q180099", "Q228822"),
  language = "en"
)
## End(Not run)
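The `retry` and `wait` parameters describe a retry loop around each API call. A minimal base R sketch of such a loop follows. It is a simplification under stated assumptions: it sleeps a fixed `wait` between attempts rather than the time requested by the API, and `with_retry()` and `flaky()` are hypothetical names, not functions of the package.

```r
# Hypothetical helper: call `f()`, and after a failure try again up to
# `retry` more times, sleeping `wait` seconds between attempts.
with_retry <- function(f, retry = 10, wait = 0) {
  attempt <- 0
  repeat {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt >= retry) {
      stop(result) # give up and re-signal the last error
    }
    attempt <- attempt + 1
    Sys.sleep(wait)
  }
}

# Stand-in for a rate-limited API: fails twice, then succeeds.
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("too many requests")
  "ok"
}

out <- with_retry(flaky, retry = 10, wait = 0)
```

With `retry = 0` the helper makes a single attempt and fails immediately on error.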
Gets a field, such as a label or description, from a data frame typically generated
with tw_get()
tw_get_field(df, field, id, language = tidywikidatar::tw_get_language())
df |
A data frame typically generated with |
field |
A character vector of length one. Typically, either "label" or "description". |
id |
A character vector, typically of Wikidata identifiers. The output will be of the same length and in the same order as the identifiers provided with this parameter. |
language |
Defaults to language set with |
A character vector of the same length, and with data in the same
order, as id.
## Not run:
tw_get("Q180099") %>%
  tw_get_field(field = "label", id = "Q180099")
## End(Not run)

# using test item for the sake of this example
tw_get("Q180099", id_l = tw_test_items) %>%
  tw_get_field(field = "label", id = "Q180099")
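The contract described above (output of the same length and in the same order as `id`) can be illustrated with base R's `match()`. This is only a sketch: the data frame layout, with labels stored under a property such as "label_en", is an assumption about tw_get() output, `get_field()` is a hypothetical helper, and the values other than "Margaret Mead" are placeholders.

```r
# Assumed layout of a tw_get()-style data frame: one row per id/property pair.
item_df <- data.frame(
  id = c("Q180099", "Q180099", "Q228822"),
  property = c("label_en", "description_en", "label_en"),
  value = c("Margaret Mead", "placeholder description", "placeholder label"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the described contract of tw_get_field().
get_field <- function(df, field, id, language = "en") {
  wanted <- paste0(field, "_", language)
  rows <- df[df$property == wanted, , drop = FALSE]
  # match() keeps the order and length of `id`; unmatched ids become NA
  rows$value[match(id, rows$id)]
}

labels <- get_field(
  item_df,
  field = "label",
  id = c("Q228822", "Q1", "Q180099")
)
```

Here `labels` has three elements in the order of the requested ids, with NA for the id that is not in the data frame.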
Please consult the relevant documentation for reusing content outside Wikimedia.
tw_get_image(id, format = "filename", width = NULL, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector of length 1, must start with Q, e.g. "Q254" for Wolfgang Amadeus Mozart. |
format |
A character vector, defaults to |
width |
A numeric value, defaults to |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A data frame of two columns, id and image, where image holds the
reference to the image in the requested format.
## Not run:
tw_get_image("Q180099", format = "filename")

tw_get_image("Q180099", format = "commons")

tw_get_image("Q180099", format = "embed", width = 300)
## End(Not run)
Please consult the relevant documentation for reusing content outside Wikimedia.
tw_get_image_metadata(id, image_filename = NULL, only_first = TRUE, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, attempts = 10, wait = 1, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector of length 1, must start with Q, e.g. "Q254" for Wolfgang Amadeus Mozart. |
image_filename |
Defaults to |
only_first |
Defaults to |
language |
Defaults to language set with |
id_df |
Defaults to NULL. If given, it should be a data frame typically
generated with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A character vector, corresponding to reference to the image in the requested format.
## Not run:
tw_get_image_metadata("Q180099")
## End(Not run)
Please consult the relevant documentation for reusing content outside Wikimedia.
tw_get_image_metadata_single(id, image_filename = NULL, only_first = TRUE, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, read_cache = TRUE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector of length 1, must start with Q, e.g. "Q254" for Wolfgang Amadeus Mozart. |
image_filename |
Defaults to |
only_first |
Defaults to |
language |
Defaults to language set with |
id_df |
Defaults to NULL. If given, it should be a data frame typically
generated with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
read_cache |
Logical, defaults to TRUE. Mostly used internally to prevent checking if an item is in cache if it is already known that it is not in cache. |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A character vector, corresponding to reference to the image in the requested format.
if (interactive()) {
  tw_get_image_metadata_single("Q180099")
}
Please consult the relevant documentation for reusing content outside Wikimedia.
tw_get_image_same_length(id, format = "filename", as_tibble = FALSE, only_first = TRUE, width = NULL, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector of length 1, must start with Q, e.g. "Q254" for Wolfgang Amadeus Mozart. |
format |
A character vector, defaults to |
as_tibble |
Defaults to |
only_first |
Defaults to |
width |
A numeric value, defaults to |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A character vector, corresponding to reference to the image in the requested format.
## Not run:
tw_get_image_same_length("Q180099", format = "filename")

tw_get_image_same_length("Q180099", format = "commons")

tw_get_image_same_length("Q180099", format = "embed", width = 300)
## End(Not run)
Retrieves an item from the Wikidata API and returns it as a list
tw_get_item(id, user_agent = tidywikidatar::tw_get_user_agent(), retry = 10)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
user_agent |
Defaults to |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
A list, with as many elements as there are unique ids given.
## Not run:
item_l <- tw_get_item(id = "Q180099")
tidywikidatar:::tw_extract_single(w = item_l)
## End(Not run)
Get Wikidata label in given language
tw_get_label(id, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector of Wikidata identifiers, each starting with Q, e.g. "Q254" for Wolfgang Amadeus Mozart. |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A character vector of the same length as the vector of id given, with the Wikidata label in the requested language.
## Not run:
tw_get_label(
  id = c("Q180099", "Q228822"),
  language = "en"
)

# If a label is not available, an NA value is returned
tw_get_label(
  id = c("Q64733534", "Q4773904", "Q220480"),
  language = "sc"
)
## End(Not run)
Efficiently get a wide table with various properties of a given set of Wikidata identifiers
tw_get_p_wide(id, p, label = FALSE, property_label_as_column_name = FALSE, both_id_and_label = FALSE, only_first = FALSE, preferred = FALSE, unlist = FALSE, collapse = ";", language = tidywikidatar::tw_get_language(), id_df = NULL, id_df_label = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
label |
Logical, defaults to |
property_label_as_column_name |
Logical, defaults to |
both_id_and_label |
Logical, defaults to |
only_first |
Logical, defaults to |
preferred |
Logical, defaults to |
unlist |
Logical, defaults to |
collapse |
Defaults to ";". Character used to separate results when
|
language |
Defaults to language set with |
id_df |
Defaults to |
id_df_label |
Defaults to NULL. If given, it should be a dataframe
typically generated with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
A data frame, with a column for each given property.
if (interactive()) {
  tw_get_p_wide(
    id = c("Q180099", "Q228822", "Q191095"),
    p = c("P27", "P19", "P20"),
    label = TRUE,
    only_first = TRUE
  )
}
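To make the shape of the result concrete, here is a base R sketch of turning long property data into a wide table with one column per property. It is an illustration, not the package's code: `to_wide()` is a hypothetical helper, the ids and values are made up, and joining multiple values with `collapse` is an assumption about how that parameter is meant to be used.

```r
# Long-format input: one row per id/property/value (illustrative values only).
long_df <- data.frame(
  id = c("Q1", "Q1", "Q1", "Q2"),
  property = c("P27", "P19", "P27", "P19"),
  value = c("Q30", "Q60", "Q142", "Q84"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one row per id, one column per requested property,
# multiple values joined with `collapse`, NA where there is no value.
to_wide <- function(df, p, collapse = ";") {
  ids <- unique(df$id)
  wide <- data.frame(id = ids, stringsAsFactors = FALSE)
  for (prop in p) {
    wide[[prop]] <- vapply(ids, function(current_id) {
      values <- df$value[df$id == current_id & df$property == prop]
      if (length(values) == 0) {
        NA_character_
      } else {
        paste(values, collapse = collapse)
      }
    }, character(1), USE.NAMES = FALSE)
  }
  wide
}

wide_df <- to_wide(long_df, p = c("P27", "P19", "P20"))
```

Every requested property gets its own column, even when no item has a value for it, so downstream code can rely on a stable set of columns.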
This function wraps tw_get_p(), but always sets only_first and
preferred to TRUE so that it always returns a character vector.
tw_get_p1(id, p, latest_start_time = FALSE, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
latest_start_time |
Logical, defaults to |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A character vector of the same length as the input.
## Not run:
tw_get_p1(id = "Q180099", "P26")
## End(Not run)
Get Wikidata property of one or more items as a tidy data frame
tw_get_property(id, p, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A tibble, corresponding to the value for the given property. A tibble of zero rows if no relevant property found.
## Not run:
# Who were the doctoral advisors - P184 - of Margaret Mead - Q180099?
advisors <- tw_get_property(id = "Q180099", p = "P184")
advisors

# tw_get_label(advisors)

# It is also possible to get one property for many id
tw_get_property(
  id = c("Q180099", "Q228822"),
  p = "P31"
)

# Or many properties for a single id
tw_get_property(
  id = "Q180099",
  p = c("P21", "P31")
)
## End(Not run)
Get description of a Wikidata property in a given language
tw_get_property_description(property, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0)
property |
A character vector. Each element must start with P, e.g. "P31". |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
A character vector, with the Wikidata description in the requested language.
## Not run:
tw_get_property_description(property = "P31")
## End(Not run)
Get description of a Wikidata property in a given language
tw_get_property_description_single(property, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0)
property |
A character vector of length 1, must start with P, e.g. "P31". |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
A character vector of length 1, with the Wikidata description in the requested language.
## Not run:
tidywikidatar:::tw_get_property_description_single(property = "P31")
## End(Not run)
Get label of a Wikidata property in a given language
tw_get_property_label(property, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0)
property |
A character vector. Each element must start with P, e.g. "P31". |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
A character vector, with the Wikidata label in the requested language.
tw_get_property_label(property = "P31")
Get label of a Wikidata property in a given language
tw_get_property_label_single(property, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0)
property |
A character vector of length 1, must start with P, e.g. "P31". |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
A character vector of length 1, with the Wikidata label in the requested language.
## Not run:
tidywikidatar:::tw_get_property_label_single(property = "P31")
## End(Not run)
Get Wikidata property of an item as a vector or list of the same length as input
tw_get_property_same_length(id, p, only_first = FALSE, preferred = FALSE, latest_start_time = FALSE, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
tw_get_p(id, p, only_first = FALSE, preferred = FALSE, latest_start_time = FALSE, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
only_first |
Logical, defaults to |
preferred |
Logical, defaults to |
latest_start_time |
Logical, defaults to |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A list of the same length as input (or a character vector if
only_first is set to TRUE)
# By default, it returns a list of the same length as input,
# no matter how many values for each id/property
## Not run:
tw_get_property_same_length(
  id = c("Q180099", "Q228822", "Q76857"),
  p = "P26"
)

# Notice that if no relevant match is found, it returns a NA
# This is useful for piped operations
tibble::tibble(id = c("Q180099", "Q228822", "Q76857")) %>%
  dplyr::mutate(spouse = tw_get_property_same_length(id, "P26"))

# Consider unnesting for further analysis
tibble::tibble(id = c("Q180099", "Q228822", "Q76857")) %>%
  dplyr::mutate(spouse = tw_get_property_same_length(id, "P26")) %>%
  tidyr::unnest(cols = spouse)

# If you are sure that you are interested only in the first return value,
# consider setting only_first=TRUE to get a character vector rather than a list
# Be mindful: you may well be discarding valid values.
tibble::tibble(id = c("Q180099", "Q228822", "Q76857")) %>%
  dplyr::mutate(spouse = tw_get_property_same_length(id, "P26",
    only_first = TRUE
  ))
## End(Not run)
tw_get_p(id = "Q180099", "P26")
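The same-length contract described above can be illustrated without calling Wikidata. The sketch below uses a hand-written list of placeholder strings (not real Wikidata values) and a hypothetical helper, `first_or_na()`, that mimics what `only_first = TRUE` is described as doing. It is an illustration of the output shape only, not the package's implementation.

```r
# Placeholder values standing in for property values; these are NOT real
# Wikidata identifiers, they only illustrate the shape of the output.
values_l <- list(
  c("value_a", "value_b"), # an item with two values for the property
  "value_c",               # an item with a single value
  NA_character_            # an item with no match: NA keeps its position
)

# One list element per input id, however many values each id has
length(values_l)

# Hypothetical helper mimicking `only_first = TRUE`: keep the first value of
# each element, giving a character vector as long as the input
first_or_na <- function(x) {
  vapply(x, function(v) v[[1]], character(1))
}

first_values <- first_or_na(values_l)
first_values
```

Because every input keeps exactly one slot, the result can be assigned directly to a new column in a data frame, which is what makes the function convenient inside `dplyr::mutate()`.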
Gets all details of a property for one or more Wikidata items.
tw_get_property_with_details(id, p, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A tibble, corresponding to the details for the given property. NULL
if no relevant property found.
# Get "female form of label", including language
tw_get_property_with_details(id = "Q64733534", p = "P2521")
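The behaviour described for the `retry` argument can be pictured as a loop around the API call. What follows is only a minimal sketch under stated assumptions: `retry_call()` and `flaky()` are hypothetical helpers written for this illustration, the pause is a fixed number of seconds rather than the delay requested by the API, and nothing here is taken from the package's source.

```r
# Hypothetical helper, not part of tidywikidatar: call `fn()` once, then
# retry up to `retry` more times, pausing `wait_seconds` between attempts.
retry_call <- function(fn, retry = 10, wait_seconds = 0) {
  last_error <- NULL
  for (attempt in seq_len(retry + 1)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
    if (attempt <= retry) {
      Sys.sleep(wait_seconds)
    }
  }
  stop("All attempts failed: ", conditionMessage(last_error))
}

# Stand-in for an API call that fails twice before succeeding
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) {
    stop("too many requests")
  }
  "ok"
}

outcome <- retry_call(flaky, retry = 5)
outcome
```

With a real API the pause can be long, so a small `retry` value may be preferable in interactive sessions.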
Used internally. Users should rely on tw_get_property_with_details().
tw_get_property_with_details_single(id, p, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A tibble, corresponding to the details for the given property. NULL
if no relevant property found.
# Get "female form of label", including language
## Not run:
tidywikidatar:::tw_get_property_with_details_single(id = "Q64733534", p = "P2521")
## End(Not run)
N.B. In order to provide for consistently structured output, this function outputs either id or value for each qualifier. The user should keep in mind that some of these come with additional detail (e.g. the unit, precision, or reference calendar).
tw_get_qualifiers(id, p, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent(), id_l = NULL)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
id_l |
Defaults to |
A data frame (a tibble) with eight columns: id for the input id,
property, qualifier_id, qualifier_property, qualifier_value,
rank, qualifier_value_type, and set (to distinguish sets of data when
a property is present more than once)
if (interactive()) {
  tidywikidatar::tw_get_qualifiers(id = "Q180099", p = "P26", language = "en")
}

## using `tw_test_items` in examples in order to show output without calling
## on Wikidata servers
tidywikidatar::tw_get_qualifiers(
  id = "Q180099",
  p = "P26",
  language = "en",
  id_l = tw_test_items
)
N.B. In order to provide for consistently structured output, this function outputs either id or value for each qualifier. The user should keep in mind that some of these come with additional detail (e.g. the unit, precision, or reference calendar).
tw_get_qualifiers_single(id, p, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent(), id_l = NULL)
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
p |
A character vector, a property. Must always start with the capital letter "P", e.g. "P31" for "instance of". |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
id_l |
Defaults to |
A data frame (a tibble) with eight columns: id for the input id,
property, qualifier_id, qualifier_property, qualifier_value,
rank, qualifier_value_type, and set (to distinguish sets of data when
a property is present more than once)
if (interactive()) {
  tidywikidatar:::tw_get_qualifiers_single(id = "Q180099", p = "P26", language = "en")
}

## using `tw_test_items` in examples in order to show output without calling
## on Wikidata servers
tidywikidatar:::tw_get_qualifiers_single(
  id = "Q180099",
  p = "P26",
  language = "en",
  id_l = tw_test_items
)
Return (most) information from a Wikidata item in a tidy format from a single Wikidata identifier
tw_get_single(id, language = tidywikidatar::tw_get_language(), retry = 10, cache = NULL, overwrite_cache = FALSE, read_cache = TRUE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, id_l = NULL, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector, must start with Q, e.g. "Q180099" for the
anthropologist Margaret Mead. Can also be a data frame of one row,
typically generated with |
language |
Defaults to language set with |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
read_cache |
Logical, defaults to TRUE. Mostly used internally to prevent checking if an item is in cache if it is already known that it is not in cache. |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
id_l |
Defaults to |
user_agent |
Defaults to |
A data.frame (a tibble) with four columns (id, property, value, and
rank). If the item is not found or there is trouble connecting to the server, a data
frame with four columns and zero rows is returned, with the warning as an
attribute, which can be retrieved with attr(output, "warning").
if (interactive()) {
  tidywikidatar:::tw_get_single(
    id = "Q180099",
    language = "en"
  )
}

## using `tw_test_items` in examples in order to show output without calling
## on Wikidata servers
tidywikidatar:::tw_get_single(
  id = "Q180099",
  language = "en",
  id_l = tw_test_items
)
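The return value described above signals failure through an attribute rather than an error, so calling code may want to check for it explicitly. The sketch below builds a stand-in for such an output by hand in base R (a plain data frame rather than a tibble). The warning text is a made-up placeholder, not a message produced by the package.

```r
# Hand-built stand-in for the zero-row output described above;
# the warning text is a placeholder invented for this illustration.
output <- data.frame(
  id = character(0),
  property = character(0),
  value = character(0),
  rank = character(0)
)
attr(output, "warning") <- "placeholder warning message"

# Check for a failed retrieval before any further processing
failed <- nrow(output) == 0 && !is.null(attr(output, "warning"))

if (failed) {
  message("Item not retrieved: ", attr(output, "warning"))
}
```

Checking both the row count and the attribute separates a failed request from an item that simply yields no rows.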
Get URL to a Wikipedia article corresponding to a Wikidata Q identifier in given language
tw_get_wikipedia(id, full_link = TRUE, language = tidywikidatar::tw_get_language(), id_df = NULL, cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0, retry = 10, user_agent = tidywikidatar::tw_get_user_agent())
id |
A character vector of length 1, must start with Q, e.g. "Q254" for Wolfgang Amadeus Mozart. |
full_link |
Logical, defaults to |
language |
Defaults to language set with |
id_df |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
user_agent |
Defaults to |
A character vector of the same length as the vector of id given, with the Wikipedia link in the requested language.
## Not run:
tw_get_wikipedia(id = "Q180099")
## End(Not run)
Mostly used internally
tw_get_wikipedia_base_api_url(url = NULL, title = NULL, language = tidywikidatar::tw_get_language(), action = "query", type = "page")
url |
A character vector with the full URL to one or more Wikipedia pages. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Defaults to language set with |
action |
Defaults to "query". Usually either "query" or "parse". In principle, any valid action value, see: https://www.mediawiki.org/w/api.php |
type |
Defaults to "page". Either "page" or "category". |
A character vector of base urls to be used with the MediaWiki API.
tw_get_wikipedia_base_api_url(title = "Margaret Mead", language = "en")
tw_get_wikipedia_base_api_url(
  title = "Category:American women anthropologists",
  type = "category",
  language = "en"
)
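To give a sense of what a base URL for the MediaWiki API looks like, here is a minimal sketch in base R. `build_api_url()` is a hypothetical helper written for this illustration. The exact parameters and their order in the URL returned by `tw_get_wikipedia_base_api_url()` may well differ, so treat the output as an example of the general pattern only. The `titles` parameter used here belongs to `action=query`.

```r
# Hypothetical helper, not the package's implementation: build a MediaWiki
# `action=query` URL for a page title in a given language edition.
build_api_url <- function(title, language) {
  paste0(
    "https://", language, ".wikipedia.org/w/api.php",
    "?action=query",
    "&format=json",
    "&titles=", utils::URLencode(title, reserved = TRUE)
  )
}

api_url <- build_api_url(title = "Margaret Mead", language = "en")
api_url
```

Percent-encoding the title matters because page titles routinely contain spaces, colons, and non-ASCII characters.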
Get all Wikidata Q identifiers of all Wikipedia pages (or files, or subcategories) that are members of the given category.
tw_get_wikipedia_category_members(url = NULL, category = NULL, type = "page", language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10)
url |
Full URL to a Wikipedia category page. If given, title and language can be left empty. |
category |
Title of a Wikipedia category page or final parts of its url. Must include "Category:", or equivalent in other languages. If given, url can be left empty, but language must be provided. |
type |
Defaults to "page", defines which kind of members of a category
are returned. Valid values include "page", "file", and "subcat" (for
sub-category). Corresponds to |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
A data frame (a tibble) with eight columns: source_title_url,
source_wikipedia_title, source_qid, wikipedia_title, wikipedia_id,
qid, description, and language.
if (interactive()) {
  sub_categories <- tw_get_wikipedia_category_members(
    category = "Category:American women anthropologists",
    type = "subcat"
  )

  sub_categories

  tw_get_wikipedia_category_members(
    category = sub_categories$wikipedia_title,
    type = "page"
  )
}
Get all Wikidata Q identifiers of all Wikipedia pages (or files, or subcategories) that are members of the given category
tw_get_wikipedia_category_members_single(url = NULL, category = NULL, type = "page", language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10)
url |
Full URL to a Wikipedia category page. If given, title and language can be left empty. |
category |
Title of a Wikipedia category page or final parts of its url. Must include "Category:", or equivalent in other languages. If given, url can be left empty, but language must be provided. |
type |
Defaults to "page", defines which kind of members of a category
are returned. Valid values include "page", "file", and "subcat" (for
sub-category). Corresponds to |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
A data frame (a tibble) with four columns: wikipedia_title, wikipedia_id, wikidata_id, wikidata_description.
if (interactive()) {
  tidywikidatar:::tw_get_wikipedia_category_members_single(
    category = "Category:American women anthropologists",
    type = "subcat"
  )

  tidywikidatar:::tw_get_wikipedia_category_members_single(
    category = "Category:Puerto Rican women anthropologists",
    type = "page"
  )
}
Get all Wikidata Q identifiers of all Wikipedia pages that appear in one or more pages
tw_get_wikipedia_page_links(url = NULL, title = NULL, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10)
url |
Full url to a Wikipedia page. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Two-letter language code used to define the Wikipedia version
to use. Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
A data frame (a tibble) with eight columns: source_title_url,
source_wikipedia_title, source_qid, wikipedia_title, wikipedia_id,
qid, description, and language.
if (interactive()) {
  tw_get_wikipedia_page_links(title = "Margaret Mead", language = "en")
}
Get all Wikidata Q identifiers of all Wikipedia pages that appear in a given page
tw_get_wikipedia_page_links_single(url = NULL, title = NULL, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10, wikipedia_page_qid_df = NULL)
url |
Full url to a Wikipedia page. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Two-letter language code used to define the Wikipedia version
to use. Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
wikipedia_page_qid_df |
Defaults to |
A data frame (a tibble) with four columns: wikipedia_title,
wikipedia_id, wikidata_id, wikidata_description.
if (interactive()) {
  tw_get_wikipedia_page_links_single(title = "Margaret Mead", language = "en")
}
Gets the Wikidata Q identifier of one or more Wikipedia pages
tw_get_wikipedia_page_qid(url = NULL, title = NULL, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10)
url |
A character vector with the full URL to one or more Wikipedia pages. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
A data frame with six columns, including qid with Wikidata
identifiers, and a logical disambiguation to flag when disambiguation
pages are returned.
if (interactive()) {
  tw_get_wikipedia_page_qid(title = "Margaret Mead", language = "en")

  # check when Wikipedia returns disambiguation page
  tw_get_wikipedia_page_qid(title = c("Rome", "London", "New York", "Vienna"))
}
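The arguments above note that when a full `url` is given, `title` and `language` can be left empty. The reason is that both can be read off a standard Wikipedia URL. The sketch below shows one way to do this in base R. `parse_wikipedia_url()` is a hypothetical helper and the regular expression is an assumption about the common `https://<language>.wikipedia.org/wiki/<title>` pattern, not the package's own parsing code.

```r
# Hypothetical helper: split full Wikipedia URLs into language code and title.
# Assumes the common "https://<language>.wikipedia.org/wiki/<title>" pattern.
parse_wikipedia_url <- function(url) {
  pattern <- "^https?://([^.]+)\\.wikipedia\\.org/wiki/(.+)$"
  raw_title <- sub(pattern, "\\2", url)
  # Decode one element at a time, since URLdecode() is not reliably
  # vectorised across R versions
  decoded_title <- vapply(raw_title, utils::URLdecode, character(1),
    USE.NAMES = FALSE
  )
  data.frame(
    language = sub(pattern, "\\1", url),
    # Underscores in URLs stand for spaces in page titles
    title = gsub("_", " ", decoded_title, fixed = TRUE),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_wikipedia_url(c(
  "https://en.wikipedia.org/wiki/Margaret_Mead",
  "https://it.wikipedia.org/wiki/Margaret_Mead"
))
parsed
```

URLs that do not follow this pattern (for example mobile subdomains) would need extra handling, which this sketch does not attempt.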
Gets the Wikidata id of a Wikipedia page
tw_get_wikipedia_page_qid_single(title = NULL, url = NULL, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10)
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
url |
A character vector with the full URL to one or more Wikipedia pages. If given, title and language can be left empty. |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
A data frame (a tibble) with seven columns: title,
wikipedia_title, wikipedia_id, qid, description, disambiguation,
and language.
if (interactive()) {
  tw_get_wikipedia_page_qid_single(title = "Margaret Mead", language = "en")
}
Get links from a specific section of a Wikipedia page
tw_get_wikipedia_page_section_links(url = NULL, title = NULL, section_title = NULL, section_index = NULL, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10, wikipedia_page_qid_df = NULL)
url |
Full url to a Wikipedia page. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
section_title |
Defaults to |
section_index |
Defaults to |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
wikipedia_page_qid_df |
Defaults to |
A data frame (a tibble).
if (interactive()) {
  tw_get_wikipedia_page_section_links(title = "Margaret Mead", language = "en", section_index = 1)
}
Get sections of a Wikipedia page
tw_get_wikipedia_page_sections(url = NULL, title = NULL, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10)
url |
Full url to a Wikipedia page. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
A data frame (a tibble), with the same columns as
tw_empty_wikipedia_page_sections.
if (interactive()) {
  tw_get_wikipedia_page_sections(title = "Margaret Mead", language = "en")
}
Get sections of a Wikipedia page
tw_get_wikipedia_page_sections_single(url = NULL, title = NULL, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 1, attempts = 10, wikipedia_page_qid_df = NULL)
url |
Full url to a Wikipedia page. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 1. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
attempts |
Defaults to 10. Number of times it re-attempts to reach the API before failing. |
wikipedia_page_qid_df |
Defaults to |
A data frame (a tibble) with four columns: wikipedia_title,
wikipedia_id, wikidata_id, wikidata_description.
if (interactive()) {
  tw_get_wikipedia_page_sections_single(title = "Margaret Mead", language = "en")
}
Mostly used internally
tw_get_wikipedia_section_links_api_url(url = NULL, title = NULL, section_index, language = tidywikidatar::tw_get_language())
url |
A character vector with the full URL to one or more Wikipedia pages. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
section_index |
Required. It should correspond to the ordinal of a
section of the relevant Wikipedia page. See also
|
language |
Two-letter language code used to define the Wikipedia version
to use. Defaults to language set with |
A character vector of base urls to be used with the MediaWiki API
tw_get_wikipedia_section_links_api_url(title = "Margaret Mead", section_index = 1, language = "en")
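For orientation, the MediaWiki API can return the links found in a single section through `action=parse` combined with the `page`, `section`, and `prop=links` parameters. The sketch below assembles such a URL in base R. `build_section_links_url()` is a hypothetical helper and the parameter order is arbitrary, so the URL actually produced by `tw_get_wikipedia_section_links_api_url()` may look different.

```r
# Hypothetical helper, not the package's implementation: build a MediaWiki
# `action=parse` URL that asks for the links found in one section of a page.
build_section_links_url <- function(title, section_index, language) {
  paste0(
    "https://", language, ".wikipedia.org/w/api.php",
    "?action=parse",
    "&format=json",
    "&prop=links",
    "&section=", section_index,
    "&page=", utils::URLencode(title, reserved = TRUE)
  )
}

section_url <- build_section_links_url(
  title = "Margaret Mead",
  section_index = 1,
  language = "en"
)
section_url
```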
Mostly used internally
tw_get_wikipedia_sections_api_url(url = NULL, title = NULL, language = tidywikidatar::tw_get_language())
url |
Full url to a Wikipedia page. If given, title and language can be left empty. |
title |
Title of a Wikipedia page or final parts of its url. If given, url can be left empty, but language must be provided. |
language |
Defaults to language set with |
A character vector of base urls to be used with the MediaWiki API
tw_get_wikipedia_sections_api_url(title = "Margaret Mead", language = "en")
Tested only with SQLite and MySQL. May work with other drivers.
tw_index_cache_item(table_name = NULL, check_first = TRUE, type = "item", show_details = FALSE, language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE)
table_name |
Name of the table in the database. If given, it takes precedence over other parameters. |
check_first |
Logical, defaults to |
type |
Defaults to "item". Type of cache file to output. Values
typically used by |
show_details |
Logical, defaults to |
language |
Language to be used for the search. Can be set once per
session with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
To ensure smooth functioning, the search column in the cache table is
transformed into a column of type varchar and length 255.
If show_details is set to FALSE, returns nothing; the function is used only
for its side effects (adding an index to the caching table). If TRUE, returns
a data frame, same as the output of tw_check_cache_index(show_details = TRUE).
if (interactive()) {
  tw_enable_cache()
  tw_set_cache_folder(path = fs::path(
    fs::path_home_r(),
    "R", "tw_data"
  ))
  tw_index_cache_item()
}
Tested only with SQLite and MySQL. May work with other drivers.
tw_index_cache_search(table_name = NULL, check_first = TRUE, type = "item", show_details = FALSE, language = tidywikidatar::tw_get_language(), response_language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE)
table_name |
Name of the table in the database. If given, it takes precedence over other parameters. |
check_first |
Logical, defaults to |
type |
Defaults to "item". Type of cache file to output. Values
typically used by |
show_details |
Logical, defaults to |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
To ensure smooth functioning, the search column in the cache table is
transformed into a column of type varchar and length 255.
If show_details is set to FALSE, returns nothing; the function is used only
for its side effects (adding an index to the caching table). If TRUE, returns
a data frame, same as the output of tw_check_cache_index(show_details = TRUE).
if (interactive()) {
  tw_enable_cache()
  tw_set_cache_folder(path = fs::path(
    fs::path_home_r(),
    "R", "tw_data"
  ))
  tw_index_cache_search()
}
Gets labels for all columns with names such as "id" and "property".
tw_label(df, value = TRUE, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0)
df |
A data frame, typically generated with other |
value |
Logical, defaults to |
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
A data frame, with the same shape as the input data frame, but with labels instead of identifiers.
if (interactive()) {
  tw_get_qualifiers(id = "Q180099", p = "P26", language = "en") %>%
    head(2) %>%
    tw_label()
}
The Wikidata Q identifier of European airports found in Eurostat's avia_par_ dataset
tw_qid_airports
A data frame with 429 rows and 1 column:
Q identifiers
https://www.wikidata.org/wiki/Wikidata:Main_Page
A dataset with all the Wikidata items that have "Q27169" (member of the European Parliament) for the property "P39" (position held).
tw_qid_meps
A data frame with 4581 rows and 1 column:
Q identifiers
https://www.wikidata.org/wiki/Wikidata:Main_Page
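The description of `tw_qid_meps` corresponds to a single property/value pair, so a comparable list of identifiers could presumably be retrieved live with `tw_query()`. This is a minimal sketch under that assumption; `mep_query` and `meps_df` are illustrative names, and a live result is unlikely to match the 4581 rows of the bundled dataset exactly, since Wikidata changes over time.

```r
# Property/value pair taken from the dataset description:
# P39 (position held) = Q27169 (member of the European Parliament)
mep_query <- list(
  c(p = "P39", q = "Q27169")
)

# The live query needs network access and the package installed,
# hence the guard.
if (interactive() && requireNamespace("tidywikidatar", quietly = TRUE)) {
  meps_df <- tidywikidatar::tw_query(query = mep_query)
  head(meps_df)
}
```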
This function aims to facilitate only the most basic type of queries: return which items have the given property pairs. For more details on Wikidata queries, see the examples in the official documentation.
tw_query(query, fields = c("item", "itemLabel", "itemDescription"), language = tidywikidatar::tw_get_language(), return_as_tw_search = TRUE, user_agent = tidywikidatar::tw_get_user_agent())
query |
A list of named vectors, or a data frame (see example and readme). |
fields |
A character vector of Wikidata fields. Ignored if
|
language |
Defaults to language set with |
return_as_tw_search |
Logical, defaults to |
user_agent |
Defaults to a combination of |
Consider tw_get_all_with_p() if you want to get all items with a given
property, irrespective of the value.
A data frame
if (interactive()) {
  query <- list(
    c(p = "P106", q = "Q1397808"),
    c(p = "P21", q = "Q6581072")
  )
  tw_query(query)
}
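The `query` argument is documented as accepting a data frame as well as a list of named vectors. A minimal sketch of the data frame form follows, assuming one row per property/value pair and columns named `p` and `q` to mirror the names used in the list form; the column names are an assumption, not something this page confirms.

```r
# Property/value pairs as a data frame: one row per pair.
# Column names `p` and `q` are assumed to mirror the named vectors
# used in the list form of the query.
query_df <- data.frame(
  p = c("P106", "P21"),
  q = c("Q1397808", "Q6581072"),
  stringsAsFactors = FALSE
)

# The query itself needs network access, hence the guard.
if (interactive() && requireNamespace("tidywikidatar", quietly = TRUE)) {
  tidywikidatar::tw_query(query = query_df)
}
```

Building the pairs as a data frame may be convenient when they come from a file or from an earlier step of a pipeline.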
Removes the table where items are cached
tw_reset_item_cache(language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE, ask = TRUE)
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
ask |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) { tw_reset_item_cache() }
Removes the table where qualifiers are cached
tw_reset_qualifiers_cache(language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE, ask = TRUE)
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
ask |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) { tw_reset_qualifiers_cache() }
Removes from cache the table where data typically gathered with
tw_get_wikipedia_category_members() are stored.
tw_reset_wikipedia_category_members_cache(language = tidywikidatar::tw_get_language(), type = "page", cache = NULL, cache_connection = NULL, disconnect_db = TRUE, ask = TRUE)
language |
Two-letter language code used to define the Wikipedia version
to use. Defaults to language set with |
type |
Defaults to "page", defines which kind of members of a category
are returned. Valid values include "page", "file", and "subcat" (for
sub-category). Corresponds to |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
ask |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) { tw_reset_wikipedia_category_members_cache() }
Removes from cache the table where data typically gathered with
tw_get_wikipedia_page_qid() are stored.
tw_reset_wikipedia_page_cache(language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE, ask = TRUE)
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
ask |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) { tw_reset_wikipedia_page_cache() }
Removes from cache the table where data typically gathered with
tw_get_wikipedia_page_links() are stored.
tw_reset_wikipedia_page_links_cache(language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE, ask = TRUE)
language |
Two-letter language code used to define the Wikipedia version
to use. Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
ask |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) { tw_reset_wikipedia_page_links_cache() }
Removes from cache the table where data typically gathered with
tw_get_wikipedia_page_sections() are stored.
tw_reset_wikipedia_page_sections_cache(language = tidywikidatar::tw_get_language(), cache = NULL, cache_connection = NULL, disconnect_db = TRUE, ask = TRUE)
language |
Defaults to language set with |
cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
ask |
Logical, defaults to |
Nothing, used for its side effects.
if (interactive()) { tw_reset_wikipedia_page_sections_cache() }
By default, this search returns items. Set type to property or use
tw_search_property() for properties.
tw_search(search, type = "item", language = tidywikidatar::tw_get_language(), response_language = tidywikidatar::tw_get_language(), limit = 10, retry = 10, include_search = FALSE, wait = 0, user_agent = tidywikidatar::tw_get_user_agent(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
search |
A string to be searched in Wikidata |
type |
Defaults to "item". Either "item" or "property". |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
include_search |
Logical, defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
user_agent |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A data frame (a tibble) with three columns (id, label, and
description), and as many rows as there are results (by default, limited
to 10). Four columns when include_search is set to TRUE.
tw_search(search = c("Margaret Mead", "Ruth Benedict"))
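The return value above mentions a fourth column when `include_search` is set to TRUE. A short sketch of how that might be checked when searching for several strings at once follows; `search_terms` and `results` are illustrative names, and the exact name of the extra column is not stated on this page.

```r
# Several search strings passed in a single call
search_terms <- c("Margaret Mead", "Ruth Benedict")

# Needs network access, hence the guard.
if (interactive() && requireNamespace("tidywikidatar", quietly = TRUE)) {
  results <- tidywikidatar::tw_search(
    search = search_terms,
    include_search = TRUE
  )
  # Per the return value described above, four columns are expected here,
  # which should make it possible to tell which string produced which row.
  names(results)
}
```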
This search returns only items, use tw_search_property() for properties.
tw_search_item(search, language = tidywikidatar::tw_get_language(), response_language = tidywikidatar::tw_get_language(), limit = 10, include_search = FALSE, wait = 0, user_agent = tidywikidatar::tw_get_user_agent(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
search |
A string to be searched in Wikidata |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
include_search |
Logical, defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
user_agent |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A data frame (a tibble) with three columns (id, label, and
description), and as many rows as there are results (by default, limited
to 10).
## Not run:
tw_search_item(search = "Sylvia Pankhurst")
## End(Not run)
This search returns only properties, use tw_search_item() for items.
tw_search_property(search, language = tidywikidatar::tw_get_language(), response_language = tidywikidatar::tw_get_language(), limit = 10, include_search = FALSE, wait = 0, user_agent = tidywikidatar::tw_get_user_agent(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
search |
A string to be searched in Wikidata |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
include_search |
Logical, defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
user_agent |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
A data frame (a tibble) with three columns (id, label, and
description), and as many rows as there are results (by default, limited
to 10).
tw_search_property(search = "gender")
This search returns only items, use tw_search_property() for properties.
tw_search_single(search, type = "item", language = tidywikidatar::tw_get_language(), response_language = tidywikidatar::tw_get_language(), limit = 10, retry = 10, include_search = FALSE, user_agent = tidywikidatar::tw_get_user_agent(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE, wait = 0)
search |
A string to be searched in Wikidata |
type |
Defaults to "item". Either "item" or "property". |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
limit |
Maximum number of responses to be given. |
retry |
Defaults to 10. Maximum number of times to retry if the API
throws an error, such as "too many requests". Each time, it will wait as
much time as requested by the API. Notice that this can be a long time,
e.g. 30 minutes. Set to |
include_search |
Logical, defaults to |
user_agent |
Defaults to |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
wait |
In seconds, defaults to 0. Time to wait between queries to Wikidata. If data are cached locally, wait time is not applied. If you are running many queries systematically you may want to add some waiting time between queries. |
A data frame (a tibble) with three columns (id, label, and
description), and as many rows as there are results (by default, limited
to 10). Four columns when include_search is set to TRUE.
## Not run:
tidywikidatar:::tw_search_single(search = "Sylvia Pankhurst")
## End(Not run)
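The `retry` and `wait` arguments above describe a retry-on-error pattern. The following is a generic base R sketch of that pattern, not the package's internal implementation: `retry_with_wait()` and `flaky()` are hypothetical names, and a fixed pause stands in for the wait time that the API would actually request.

```r
# Generic retry helper (illustrative only; not tidywikidatar's own code).
# Calls `fun` until it succeeds or `retry` further attempts have failed,
# pausing `wait` seconds between attempts.
retry_with_wait <- function(fun, retry = 10, wait = 0) {
  attempt <- 0
  repeat {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    attempt <- attempt + 1
    if (attempt > retry) {
      stop(result) # give up and re-signal the last error
    }
    Sys.sleep(wait)
  }
}

# Hypothetical function that fails twice, then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("too many requests")
  "ok"
}

result <- retry_with_wait(flaky, retry = 5, wait = 0)
result
```

Setting `wait` above zero in a helper of this kind spaces out repeated requests, which is the same motivation given for the `wait` argument when running many queries systematically.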
Set database connection settings for the session
tw_set_cache_db(db_settings = NULL, driver = NULL, host = NULL, server = NULL, port = NULL, database = NULL, user = NULL, pwd = NULL)
db_settings |
A list of database connection settings (see example) |
driver |
A database driver. Common database drivers include |
host |
Host address, e.g. "localhost". Different drivers use server or host parameter, only one of them is likely needed. |
server |
Server address, e.g. "localhost". Different drivers use server or host parameter, only one of them is likely needed. |
port |
Port to use to connect to the database. |
database |
Database name. |
user |
Database user name. |
pwd |
Password for the database user. |
A list with all given parameters (invisibly).
if (interactive()) {
  # Settings can be provided either as a list
  db_settings <- list(
    driver = "MySQL",
    host = "localhost",
    server = "localhost",
    port = 3306,
    database = "tidywikidatar",
    user = "secret_username",
    pwd = "secret_password"
  )

  tw_set_cache_db(db_settings)

  # or as parameters
  tw_set_cache_db(
    driver = "MySQL",
    host = "localhost",
    server = "localhost",
    port = 3306,
    database = "tidywikidatar",
    user = "secret_username",
    pwd = "secret_password"
  )

  # or ignoring fields that can be left to default values, such as "localhost" and port 3306
  tw_set_cache_db(
    driver = "MySQL",
    database = "tidywikidatar",
    user = "secret_username",
    pwd = "secret_password"
  )
}
Consider using a folder out of your current project directory, e.g.
tw_set_cache_folder("~/R/tw_data/"): you will be able to use the same cache
in different projects, and prevent cached files from being sync-ed if you use
services such as Nextcloud or Dropbox.
tw_set_cache_folder(path = NULL)
tw_get_cache_folder(path = NULL)
path |
A path to a location used for caching data. If the folder does not exist, it will be created. |
The path to the caching folder, if previously set; the same path as
given to the function; or the default, tw_data, if none is given.
if (interactive()) {
  tw_set_cache_folder(fs::path(fs::path_home_r(), "R", "tw_data"))
}
tw_get_cache_folder()
Defaults to "en".
tw_set_language(language = NULL)
tw_get_language(language = NULL)
language |
A character vector of length one, with a string of two letters such as "en". For a full list of available values, see: https://www.wikidata.org/wiki/Help:Wikimedia_language_codes/lists/all |
A two-letter code for the language, if previously set; the same language as given to the function; or the default, en, if none is given.
if (interactive()) {
  tw_set_language(language = "en")
}
tw_get_language()
Defaults to current package name (tidywikidatar) and version.
tw_set_user_agent(user_agent = NULL)
tw_get_user_agent(user_agent = NULL)
user_agent |
Defaults to |
The user agent set for the session, implicitly.
# Default user agent
default_user_agent <- tw_get_user_agent()
default_user_agent

# Custom user agent
tw_set_user_agent(user_agent = "custom_project_name/email")
new_user_agent <- tw_get_user_agent()
new_user_agent

# Restore
tw_set_user_agent(user_agent = default_user_agent)
tw_get_user_agent()
A list mostly used for testing with some Wikidata items in the format resulting from tw_get_item()
tw_test_items
A list, an object such as the one resulting from tw_get_item()
Writes item to cache. Typically used internally, but exported to enable custom caching solutions.
tw_write_item_to_cache(item_df, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
item_df |
A data frame with three columns typically generated with
|
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
Nothing, used for its side effects.
tw_set_cache_folder(path = fs::path(tempdir(), paste(sample(letters, 24), collapse = "")))
tw_create_cache_folder(ask = FALSE)
tw_disable_cache()

df_from_api <- tw_get(id = "Q180099", language = "en")

df_from_cache <- tw_get_cached_item(
  id = "Q180099",
  language = "en"
)

is.null(df_from_cache) # expect TRUE, as nothing has yet been stored in cache

tw_write_item_to_cache(
  item_df = df_from_api,
  language = "en",
  cache = TRUE
)

df_from_cache <- tw_get_cached_item(
  id = "Q180099",
  language = "en",
  cache = TRUE
)

is.null(df_from_cache) # expect FALSE: df_from_cache is now a data frame, same as df_from_api
Mostly used internally by tidywikidatar, use with caution to keep caching
consistent.
tw_write_qid_of_wikipedia_page_to_cache(df, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
df |
A data frame typically generated with
|
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
Silently returns the same data frame provided as input. Mostly used internally for its side effects.
if (interactive()) {
  df <- tw_get_wikipedia_page_qid(
    title = "Margaret Mead",
    language = "en",
    cache = FALSE
  )

  tw_write_qid_of_wikipedia_page_to_cache(
    df = df,
    language = "en"
  )
}
Mostly to be used internally by tidywikidatar, use with caution to keep
caching consistent.
tw_write_qualifiers_to_cache(qualifiers_df, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
qualifiers_df |
A data frame typically generated with
|
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
Silently returns the same data frame provided as input. Mostly used internally for its side effects.
if (interactive()) {
  q_df <- tw_get_qualifiers(
    id = "Q180099",
    p = "P26",
    language = "en",
    cache = FALSE
  )

  tw_write_qualifiers_to_cache(
    qualifiers_df = q_df,
    language = "en",
    cache = TRUE
  )
}
Writes search to cache. Typically used internally, but exported to enable custom caching solutions.
tw_write_search_to_cache(search_df, type = "item", language = tidywikidatar::tw_get_language(), response_language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
search_df |
A data frame with four columns typically generated with
|
type |
Defaults to "item". Either "item" or "property". |
language |
Language to be used for the search. Can be set once per
session with |
response_language |
Language to be used for the returned labels and
descriptions. Corresponds to the |
cache |
Defaults to |
overwrite_cache |
Defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
Nothing, used for its side effects.
## Not run:
tw_set_cache_folder(path = fs::path(tempdir(), paste(sample(letters, 24), collapse = "")))
tw_create_cache_folder(ask = FALSE)
tw_disable_cache()

search_from_api <- tw_search(search = "Sylvia Pankhurst", include_search = TRUE)

search_from_cache <- tw_get_cached_search("Sylvia Pankhurst")

nrow(search_from_cache) == 0 # expect TRUE, as nothing has yet been stored in cache

tw_write_search_to_cache(search_df = search_from_api)

search_from_cache <- tw_get_cached_search("Sylvia Pankhurst")

search_from_cache
## End(Not run)
Mostly used internally by tidywikidatar, use with caution to keep caching consistent.
tw_write_wikipedia_category_members_to_cache(df, language = tidywikidatar::tw_get_language(), type = "page", cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
df |
A data frame typically generated with |
language |
Defaults to language set with |
type |
Defaults to "page", defines which kind of members of a category are returned. Valid values include "page", "file", and "subcat" (for sub-category). Corresponds to |
cache |
Defaults to NULL. If given, it should be given either TRUE or FALSE. Typically set with |
overwrite_cache |
Logical, defaults to FALSE. If TRUE, it overwrites the table in the local SQLite database. Useful if the original Wikidata object has been updated. |
cache_connection |
Defaults to NULL. If NULL, and caching is enabled, |
disconnect_db |
Defaults to TRUE. If FALSE, leaves the connection to cache open. |
Silently returns the same data frame provided as input. Mostly used internally for its side effects.
if (interactive()) {
  df <- tw_get_wikipedia_category_members(
    category = "American women anthropologists",
    language = "en",
    cache = FALSE
  )

  tw_write_wikipedia_category_members_to_cache(
    df = df,
    language = "en"
  )
}
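The `cache_connection` and `disconnect_db` arguments suggest that a single connection can be kept open across several calls when `disconnect_db` is FALSE. A hedged sketch of that usage follows. It assumes that the package's `tw_connect_to_cache()` and `tw_disconnect_from_cache()` helpers are available with the arguments shown and that `tw_get_wikipedia_category_members()` accepts the same connection arguments, none of which is confirmed on this page; the second category name is only illustrative.

```r
# Categories to process with one shared connection (illustrative values)
categories <- c(
  "American women anthropologists",
  "American anthropologists"
)

if (interactive() && requireNamespace("tidywikidatar", quietly = TRUE)) {
  library("tidywikidatar")
  tw_enable_cache()
  tw_set_cache_folder(path = fs::path(tempdir(), "tw_cache_folder"))
  tw_create_cache_folder(ask = FALSE)

  # Assumed helper: opens the connection to the cache once
  cache_connection <- tw_connect_to_cache()

  for (current_category in categories) {
    # disconnect_db = FALSE leaves the connection open for the next call
    tw_get_wikipedia_category_members(
      category = current_category,
      language = "en",
      cache_connection = cache_connection,
      disconnect_db = FALSE
    )
  }

  # Assumed helper: closes the connection when all calls are done
  tw_disconnect_from_cache(cache_connection = cache_connection)
}
```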
Mostly used internally by tidywikidatar, use with caution to keep caching
consistent.
tw_write_wikipedia_page_links_to_cache(df, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
df |
A data frame typically generated with
|
language |
Two-letter language code used to define the Wikipedia version
to use. Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
Silently returns the same data frame provided as input. Mostly used internally for its side effects.
if (interactive()) {
  df <- tw_get_wikipedia_page_links(
    title = "Margaret Mead",
    language = "en",
    cache = FALSE
  )

  tw_write_wikipedia_page_links_to_cache(
    df = df,
    language = "en"
  )
}
Mostly used internally by tidywikidatar, use with caution to keep caching
consistent.
tw_write_wikipedia_page_sections_to_cache(df, language = tidywikidatar::tw_get_language(), cache = NULL, overwrite_cache = FALSE, cache_connection = NULL, disconnect_db = TRUE)
df |
A data frame typically generated with
|
language |
Defaults to language set with |
cache |
Defaults to |
overwrite_cache |
Logical, defaults to |
cache_connection |
Defaults to |
disconnect_db |
Defaults to |
Silently returns the same data frame provided as input. Mostly used internally for its side effects.
if (interactive()) {
  df <- tw_get_wikipedia_page_sections(
    title = "Margaret Mead",
    language = "en",
    cache = FALSE
  )

  tw_write_wikipedia_page_sections_to_cache(
    df = df,
    language = "en"
  )
}