Variety of functions

tax.names_list(rank='genus', size=10)

Get a random list of taxonomic names at the given rank.

Parameters:
  • rank – Taxonomic rank, one of species, genus (default), family, order.
  • size – Number of names to get. Maximum depends on the rank.

Usage:

import pytaxize
pytaxize.names_list()
pytaxize.names_list('species')
pytaxize.names_list('family')
pytaxize.names_list('order')
pytaxize.names_list('order', 2)
pytaxize.names_list('order', 15)

tax.vascan_search(q, format=..., raw=..., callopts=...)

Search the CANADENSYS Vascan API.

Parameters:
  • q – Names to search for, given as a list of strings (for example scientific names).
  • format – Response format, either json or xml.
  • raw – If True, return the raw response; if False (default), return parsed data.
  • callopts – Further arguments passed on to the request.

Usage:

import pytaxize
pytaxize.vascan_search(q = ["Helianthus annuus"])
pytaxize.vascan_search(q = ["Helianthus annuus"], raw=True)
pytaxize.vascan_search(q = ["Helianthus annuus", "Crataegus dodgei"], raw=True)

# format type
## json
pytaxize.vascan_search(q = ["Helianthus annuus"], format="json", raw=True)

## xml
pytaxize.vascan_search(q = ["Helianthus annuus"], format="xml", raw=True)

# lots of names, in this case 50
splist = pytaxize.names_list(rank='species', size=50)
pytaxize.vascan_search(q = splist)

tax.gbif_parse(scientificname)

Parse taxon names using the GBIF name parser.

Parameters:
  • scientificname – A list of scientific names.

Returns a DataFrame containing fields extracted from the parsed taxon names. The fields returned are the union of the fields extracted from all names in scientificname.

Author: John Baumgartner (johnbb@student.unimelb.edu.au)

References: http://dev.gbif.org/wiki/display/POR/Webservice+API, http://tools.gbif.org/nameparser/api.do

Usage:

import pytaxize
pytaxize.gbif_parse(scientificname=['x Agropogon littoralis'])
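
The "union of fields" behaviour described above can be illustrated without calling the service. The following is a minimal sketch in plain Python, not the library's implementation; the records, field names, and values are made up for illustration and are not actual GBIF name parser output.

```python
# Illustrative records only: the field names and values are assumptions,
# not real output from the GBIF name parser.
parsed = [
    {"scientificname": "x Agropogon littoralis", "genus": "Agropogon",
     "species": "littoralis", "hybrid": True},
    {"scientificname": "Helianthus annuus", "genus": "Helianthus",
     "species": "annuus"},
]

# Union of field names across all records, in first-seen order.
fields = []
for record in parsed:
    for key in record:
        if key not in fields:
            fields.append(key)

# One row per name; a field that a name did not yield is filled with None.
rows = [{field: record.get(field) for field in fields} for record in parsed]

for row in rows:
    print(row)
```

Every row ends up with the same columns, so a name that lacks a given field simply carries an empty value in that column.
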

tax.scrapenames(url=None, file=None, text=None, engine=None, unique=None, verbatim=None, detect_language=None, all_data_sources=None, data_source_ids=None)

Resolve names using Global Names Recognition and Discovery.

Uses the Global Names Recognition and Discovery service, see http://gnrd.globalnames.org/.

Parameters:
  • url – An encoded URL for a web page, PDF, Microsoft Office document, or image file, see examples.
  • file – When using multipart/form-data as the content-type, a file may be sent. This should be a path to a file on your machine.
  • text – Type: string. Text content; best used with a POST request, see examples.
  • engine – (optional) Type: integer, Default: 0. Either 1 for TaxonFinder, 2 for NetiNeti, or 0 for both. If absent, both engines are used.
  • unique – (optional) Type: boolean. If True (default), the response has unique names without offsets.
  • verbatim – (optional) Type: boolean. If True (default False), the response excludes verbatim strings.
  • detect_language – (optional) Type: boolean. If True (default), NetiNeti is not used when the language of the incoming text is determined not to be English. If False, NetiNeti is used if requested.
  • all_data_sources – (optional) Type: boolean. Resolve found names against all available Data Sources.
  • data_source_ids – (optional) Type: string. Pipe-separated list of data source ids to resolve found names against. See list of Data Sources.
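
Several of these parameters have a constrained form: engine takes 0, 1, or 2, and data_source_ids is a pipe-separated string. As a minimal sketch, the helper below assembles keyword arguments that follow those descriptions. build_scrapenames_kwargs is a hypothetical helper, not part of pytaxize, and the ids 1 and 12 are placeholders rather than references to particular data sources.

```python
# Hypothetical helper, not part of pytaxize: assembles keyword arguments
# for scrapenames() according to the parameter descriptions above.
ENGINES = {0: "both", 1: "TaxonFinder", 2: "NetiNeti"}


def build_scrapenames_kwargs(engine=None, data_source_ids=None, **flags):
    kwargs = {}
    if engine is not None:
        if engine not in ENGINES:
            raise ValueError(
                f"engine must be one of {sorted(ENGINES)}, got {engine!r}"
            )
        kwargs["engine"] = engine
    if data_source_ids:
        # The parameter is documented as a pipe-separated string, e.g. "1|12".
        kwargs["data_source_ids"] = "|".join(str(i) for i in data_source_ids)
    # Keep only flags that were set explicitly; None means "use the default".
    kwargs.update({k: v for k, v in flags.items() if v is not None})
    return kwargs


# Placeholder ids 1 and 12 are for illustration only.
kwargs = build_scrapenames_kwargs(
    engine=1, data_source_ids=[1, 12], unique=True, verbatim=None
)
print(kwargs)
```

The resulting dictionary could then be unpacked into a call such as pytaxize.scrapenames(text=..., **kwargs), assuming the function accepts the parameters as documented above.
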

Usage:

import pytaxize

# Get data from a website using its URL
out = pytaxize.scrapenames(url = 'http://en.wikipedia.org/wiki/Araneae')
out['data'].head() # data
out['meta'] # metadata

# Scrape names from a pdf at a URL
out = pytaxize.scrapenames(url = 'http://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf')
out['data'].head() # data
out['meta'] # metadata

# With arguments
pytaxize.scrapenames(url = 'http://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf', unique=True)
pytaxize.scrapenames(url = 'http://www.mapress.com/zootaxa/2012/f/z03372p265f.pdf', all_data_sources=True)

# Get data from a text string
pytaxize.scrapenames(text='A spider named Pardosa moesta Banks, 1892')