Package 'patentr' reference manual

Title:	Access USPTO Bulk Data in Tidy Rectangular Format
Description:	Converts TXT and XML data curated by the United States Patent and Trademark Office (USPTO). Allows conversion of bulk data after downloading directly from the USPTO bulk data website, eliminating the need for users to wrangle multiple data formats to get large patent databases in tidy, rectangular format. Data details can be found on the USPTO website <https://bulkdata.uspto.gov/>. Currently, all 3 formats can be converted to rectangular CSV format: 1. TXT data (1976-2001); 2. XML format 1 data (2002-2004); and 3. XML format 2 data (2005-current). Relevant literature that uses data from USPTO includes Wada (2020) <doi:10.1007/s11192-020-03674-4> and Plaza & Albert (2008) <doi:10.1007/s11192-007-1763-3>.
Authors:	Raoul Wadhwa [aut, cre] (ORCID: <https://orcid.org/0000-0003-0503-9580>), James Yu [aut], Hayley Beltz [aut], Milind Desai [aut], Jacob Scott [aut], Peter Erdi [aut]
Maintainer:	Raoul Wadhwa <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.4
Built:	2026-05-21 10:09:45 UTC
Source:	https://github.com/jyprojs/patentr

Get Bulk Patent Data from USPTO

Description

Download bulk patent data from the USPTO website <https://bulkdata.uspto.gov> and convert it to tidy format. Data can be returned as a data frame or written to a file (see 'output_file' parameter). Because the USPTO issues patents weekly, a week is the smallest unit that can be requested: all patents from a given week are acquired at once.

Usage

get_bulk_patent_data(year, week, output_file)

Arguments

year

integer vector containing years from which patents should be collected

week

integer vector of weeks within the corresponding 'year' element from which patents should be collected

output_file

character string giving the path of the file to which patent data is written in CSV format; when omitted (as in the first example), the data is returned as a data frame

Value

either 'TRUE' (placeholder) or object of class 'data.frame' (see param 'output_file' for details)

Examples

## NOTE: none of the examples are run due to the download requirement
## Not run: 
# download patents from the first week of 1976 and get data frame
patent_data <- get_bulk_patent_data(year = 1976, week = 1)

# download patents from the last 5 weeks of 1980 (and write to a file)
get_bulk_patent_data(year = rep(1980, 5), week = 48:52,
                     output_file = "patent-data.csv")

## End(Not run)
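
The second example pairs each element of 'year' with the element of 'week' at the same position. A minimal sketch of building such paired vectors for several years at once is shown below. It assumes that element-wise pairing is how the two arguments are matched, which the rep(1980, 5) / 48:52 example suggests but the manual does not state outright. The names 'years', 'weeks', 'grid', 'year_arg' and 'week_arg' are illustrative only and are not part of the package.

```r
# Sketch: build paired year/week vectors for the first two weeks of two years.
# Assumption: get_bulk_patent_data() matches 'year' and 'week' element by element.
years <- 1976:1977
weeks <- 1:2

# expand.grid() varies its first argument fastest, so each year is
# repeated once for every requested week.
grid <- expand.grid(week = weeks, year = years)

year_arg <- as.integer(grid$year)
week_arg <- as.integer(grid$week)

# Both vectors must have the same length to be paired up.
stopifnot(length(year_arg) == length(week_arg))

# The call itself is left commented out because it downloads data:
# patent_data <- get_bulk_patent_data(year = year_arg, week = week_arg)
```

Building the vectors from a grid avoids hand-written rep() calls when the request spans more than one year.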

Get Patent Number from WKU

Description

Convert the WKU identifier provided in bulk patent files to the patent number used in most sources. The references listed in bulk patent files are also given in patent number format, not in WKU format.

Usage

wku_to_pno(wku)

Arguments

wku

character vector containing patent WKUs

Value

character vector containing patent numbers

Examples

# convert sample WKUs to patent number and print
sample_wku <- c("RE028671", "03930271")
print(wku_to_pno(sample_wku))

Patents issued in week 1 of the year 1976.

Description

A dataset containing information about patents issued by the United States Patent and Trademark Office (USPTO) <https://www.uspto.gov/> in the first week of the year 1976. This can be recreated by running the 'get_bulk_patent_data' function in the 'patentr' package and setting the 'year' and 'week' parameters to '1976' and '1', respectively.

Usage

y1976w1

Format

A data frame with 1379 rows and 9 variables:

WKU: unique patent identifier
Title: patent title
App_Date: date on which patent application was submitted
Issue_Date: date on which patent was issued by USPTO
Inventor: patent inventor(s)
Assignee: person(s)/corporation(s) to whom the patent was assigned
ICL_Class: patent classification based on IPC system
References: patents referenced by this patent
Claims: free-text claims made about value of this patent

Source

https://www.uspto.gov/
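
Because the bundled dataset stores identifiers in the WKU column, it can be combined with 'wku_to_pno' to obtain patent numbers. The following is a rough sketch rather than an official example. It assumes the 'patentr' package is installed and that the dataset can be loaded with data(); the names 'pno_preview', 'env' and 'patents' are illustrative. Nothing is executed if the package is unavailable.

```r
# Sketch: inspect the bundled dataset and convert a few WKUs to patent numbers.
# Assumption: 'patentr' is installed; otherwise the block is skipped.
pno_preview <- NULL

if (requireNamespace("patentr", quietly = TRUE)) {
  # Load the dataset into a private environment to avoid cluttering the workspace
  env <- new.env()
  utils::data("y1976w1", package = "patentr", envir = env)
  patents <- env$y1976w1

  # Show the dimensions and column names of the data frame
  print(dim(patents))
  print(names(patents))

  # Convert the first few WKU identifiers to patent numbers
  pno_preview <- patentr::wku_to_pno(as.character(head(patents$WKU)))
  print(pno_preview)
}
```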
