Title: | Request Scheduler |
---|---|
Description: | Offers classes and functions to contact web servers while enforcing scheduling rules required by the sites. The URL class makes it easy to construct a URL by providing parameters as a vector. The Request class describes SOAP (Simple Object Access Protocol) or standard requests: URL, method (POST or GET), header, body. The Scheduler class controls the request frequency for each server address by means of rules (Rule class). The RequestResult class gives access to the request status, to handle error cases, and to the content. |
Authors: | Pierrick Roger [aut, cre] |
Maintainer: | Pierrick Roger <[email protected]> |
License: | AGPL-3 |
Version: | 1.0.3 |
Built: | 2025-01-31 03:23:37 UTC |
Source: | https://github.com/cran/sched |
sched package.
sched offers classes and functions to contact web servers while enforcing scheduling rules required by the sites. The URL class makes it easy to construct a URL by providing parameters as a vector. The Request class describes SOAP or standard requests: URL, method (POST or GET), header, body. The Scheduler class controls the request frequency for each server address by means of rules (Rule class). The RequestResult class gives access to the request status, to handle error cases, and to the content.
Maintainer: Pierrick Roger [email protected] (ORCID)
Send the request described by a Request instance, using the provided user agent, and return the results.
get_url_request_result( request, useragent = NULL, ssl_verifypeer = TRUE, binary = FALSE )
request
A sched::Request object.
useragent
The user agent, as a character value. Example: "myapp ; [email protected]"
ssl_verifypeer
Set to FALSE to disable SSL certificate verification.
binary
Set to TRUE if the content to be retrieved is binary.
The request result, as a character value.
# Retrieve the content of a web page
u <- sched::URL$new('https://httpbin.org/get')
content <- sched::get_url_request_result(sched::Request$new(u))
Construct a sched::Request object with a valid header for a POST request.
make_post_request(url, body, mime, soap_action = NULL, encoding = NULL)
url
A sched::URL object.
body
The body of the POST request.
mime
The MIME type of the body. Example: "text/xml", "application/json".
soap_action
In case of a SOAP request, the SOAP action to contact, as a character string.
encoding
The encoding to use. A valid integer or string as required by RCurl.
A sched::Request object.
# Prepare the URL and the request body
the_url <- sched::URL$new('https://httpbin.org/anything')
the_body <- '{"some_key": "my_value"}'
# Make the request object
my_request <- sched::make_post_request(the_url, body = the_body,
                                       mime = "application/json")
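As a rough illustration of what "a valid header for a POST request" could look like, the sketch below builds a named character vector from a MIME type and an optional SOAP action. The helper `build_post_header()`, the `SOAPAction` field name and the `urn:getEntity` value are assumptions made for this example; they are not taken from the sched source code.

```r
# Hypothetical helper (not part of sched): build a POST header from a MIME
# type and an optional SOAP action.
build_post_header <- function(mime, soap_action = NULL) {
  header <- c("Content-Type" = mime)
  if (!is.null(soap_action)) {
    # Only SOAP requests carry the extra field
    header <- c(header, SOAPAction = soap_action)
  }
  header
}

json_header <- build_post_header("application/json")
soap_header <- build_post_header("text/xml", soap_action = "urn:getEntity")
print(json_header)
print(soap_header)
```

A vector of this shape matches the `header` argument of `Request$new()`, which expects a named character vector whose names are the header fields.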
Class Request.
This class represents a Request object that can be used with the Request Scheduler.
new()
Initializer.
Request$new( url, method = c("get", "post"), header = NULL, body = NULL, encoding = NULL )
url
A sched::URL object.
method
HTTP method. Either "get" or "post".
header
The header of the POST method as a named character vector. The names are the fields of the header.
body
The body as a single character value.
encoding
The encoding to use. A valid integer or string as required by RCurl.
Nothing.
# Create a GET request for the getCompleteEntity webservice of ChEBI
# database
request <- sched::Request$new(
  sched::URL$new(
    'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
    params=c(chebiId=15440)))

# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
  url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
                       'batch')),
  method='post', header=c('Content-Type'="", apikey='my-token'),
  body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
getUrl()
Gets the URL.
Request$getUrl()
The URL of this Request object as a sched::URL object.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the stored URL object
print(request$getUrl())
getMethod()
Gets the method.
Request$getMethod()
The method as a character value.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the stored method
print(request$getMethod())
getEncoding()
Gets the encoding.
Request$getEncoding()
The encoding.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://my.site.fr/'),
                              encoding='UTF-8')
# Get the stored encoding
print(request$getEncoding())
getCurlOptions()
Gets the options object to pass to the cURL library.
Makes an RCurl::CURLOptions object by calling the RCurl::curlOptions() function. The user agent, header and body are passed as options if they are not NULL.
Request$getCurlOptions(useragent = NULL)
useragent
The user agent as a character value, or NULL.
An RCurl::CURLOptions object.
# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
  url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
                       'batch')),
  method='post', header=c('Content-Type'="", apikey='my-token'),
  body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
# Get the associated RCurl options object
rcurl_opts <- request$getCurlOptions('myapp ; [email protected]')
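The "passed as options if not NULL" behaviour can be pictured with plain base R, without RCurl. In the minimal sketch below, `make_options()` is a hypothetical helper, and the option names `useragent`, `httpheader` and `postfields` are assumed to be the libcurl names involved; the actual options set by `getCurlOptions()` may differ.

```r
# Hypothetical helper (not part of sched): keep only the options that are set.
make_options <- function(useragent = NULL, header = NULL, body = NULL) {
  opts <- list(useragent = useragent, httpheader = header, postfields = body)
  # Drop every option whose value is NULL
  Filter(Negate(is.null), opts)
}

get_opts <- make_options(useragent = "myapp")
post_opts <- make_options(
  useragent = "myapp",
  header = c("Content-Type" = "application/json"),
  body = '{"recordIds": [2]}')
print(names(get_opts))
print(names(post_opts))
```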
getUniqueKey()
Gets a unique key to identify this request.
The key is an MD5 sum computed from the string representation of this request.
Request$getUniqueKey()
A unique key as an MD5 sum.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the MD5 sum of this request
print(request$getUniqueKey())
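To show why an MD5 sum of a string representation works as a request key, here is a small base R sketch. The helper `md5_of_string()` and the request strings are made up for illustration; the exact string that sched hashes is not documented here.

```r
# Hypothetical helper (not part of sched): MD5 sum of a single string.
md5_of_string <- function(s) {
  f <- tempfile()
  on.exit(unlink(f))
  # Write the raw bytes so that no trailing newline changes the sum
  writeBin(charToRaw(s), f)
  unname(tools::md5sum(f))
}

key1 <- md5_of_string("GET https://peakforest.org/")
key2 <- md5_of_string("GET https://peakforest.org/")
key3 <- md5_of_string("GET https://peakforest.org/other")
print(key1 == key2)  # identical descriptions give identical keys
print(key1 == key3)  # a different description gives another key
```

Because identical requests map to the same key, such a key can serve as a lookup name in a cache of request results.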
getHeaderAsSingleString()
Gets the HTTP header as a string, concatenating all its information into a single string.
Request$getHeaderAsSingleString()
The header as a single character value.
# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
  url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
                       'batch')),
  method='post', header=c('Content-Type'="", apikey='my-token'),
  body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
# Get back the POST header as a single string
print(request$getHeaderAsSingleString())
getBody()
Gets the body.
Request$getBody()
The body as a single character value.
# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
  url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
                       'batch')),
  method='post', header=c('Content-Type'="", apikey='my-token'),
  body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
# Get back the POST body
print(request$getBody())
print()
Displays information about this instance.
Request$print()
Returns self invisibly.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Print the Request object
print(request)
toString()
Gets a string representation of this instance.
Request$toString()
A single string giving a representation of this instance.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the string representation of this request
print(request$toString())
clone()
The objects of this class are cloneable with this method.
Request$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create a GET request for the getCompleteEntity webservice of ChEBI database
request <- sched::Request$new(
  sched::URL$new(
    'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
    params=c(chebiId=15440)))
# Get an MD5 key, unique to this request
key <- request$getUniqueKey()
# Print the request
print(request)
Class RequestResult.
Represents the result of a request.
new()
New instance initializer.
RequestResult$new( content = NULL, retry = FALSE, err_msg = NULL, status = 0, status_msg = "", retry_after = NULL, location = NULL )
content
The result content.
retry
Whether the request should be resent.
err_msg
Error message.
status
HTTP status.
status_msg
Status message.
retry_after
Time after which to retry.
location
New location.
Nothing.
getContent()
Get content.
RequestResult$getContent()
The content as a character value or NULL.
getRetry()
Get the retry flag.
RequestResult$getRetry()
TRUE if the URL request should be sent again, FALSE otherwise.
getErrMsg()
Get the error message.
RequestResult$getErrMsg()
The error message as a character value or NULL.
getStatus()
Get the HTTP status of the response.
RequestResult$getStatus()
The status as an integer.
getRetryAfter()
Get the time to wait before retrying.
RequestResult$getRetryAfter()
The time.
getLocation()
Get the redirect location.
RequestResult$getLocation()
The redirect location as a character value or NULL.
processRequestErrors()
Process possible HTTP error.
RequestResult$processRequestErrors()
Nothing.
clone()
The objects of this class are cloneable with this method.
RequestResult$clone(deep = FALSE)
deep
Whether to make a deep clone.
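The retry flag and the retry delay of a RequestResult suggest a retry loop on the caller's side. The sketch below mimics this with plain lists instead of RequestResult objects. The helper `send_with_retries()`, the `fake_send()` function and the assumption that the delay is expressed in seconds are all illustrative; this is not the loop used inside sched.

```r
# Hypothetical helper (not part of sched). `send` is any function returning a
# list with the fields `content`, `retry` and `retry_after`.
send_with_retries <- function(send, nb_max_tries = 10L, do_sleep = TRUE) {
  for (i in seq_len(nb_max_tries)) {
    res <- send()
    if (!isTRUE(res$retry)) {
      return(res$content)
    }
    # Assumption: retry_after is a number of seconds
    if (do_sleep && !is.null(res$retry_after)) {
      Sys.sleep(res$retry_after)
    }
  }
  stop("No result after ", nb_max_tries, " tries.")
}

# Fake sender that asks for a retry twice before succeeding
tries <- 0
fake_send <- function() {
  tries <<- tries + 1
  if (tries < 3) {
    list(content = NULL, retry = TRUE, retry_after = 0.01)
  } else {
    list(content = "ok", retry = FALSE, retry_after = NULL)
  }
}
content <- send_with_retries(fake_send)
print(content)
```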
Scheduling rule class.
This class represents a scheduling rule, used to limit the number of events allowed during a given period of time (a time lap).
new()
Initializer.
Rule$new(n = 3L, lap = 1)
n
Number of events during a time lap.
lap
Duration of a time lap, in seconds.
Nothing.
# Create a rule object with default parameters
r <- Rule$new()

# Create a rule object with 5 events allowed each second (default time)
r2 <- Rule$new(5L)

# Create a rule object with 5 events allowed each 3 seconds
r3 <- Rule$new(5L, 3)
getN()
Gets the number of events allowed during a lap time.
Rule$getN()
Returns the number of events as an integer.
r <- Rule$new()
# Get the allowed number of events for a rule
print(r$getN())
getLapTime()
Gets the lap time.
The number of seconds during which N events are allowed.
Rule$getLapTime()
Returns the lap time as a numeric value.
# Create a rule object with default parameters
r <- Rule$new()
# Get the configured lap time for a rule
print(r$getLapTime())
print()
Displays information about this instance.
Rule$print()
Nothing.
# Create a rule object with default parameters
r <- Rule$new()
# Print information about a rule object
print(r)
wait()
Wait (sleep) until a new event is allowed.
Rule$wait(do_sleep = TRUE)
do_sleep
Debug parameter that turns off the call to Sys.sleep(). Use only for testing.
The time spent waiting, in seconds.
# Create a rule object that allows 3 events each 0.02 seconds
r <- Rule$new(3, 0.02)
# Loop for generating 20 events
i <- 0 # event index
while (i < 20) {
  # Wait until next event is allowed
  wait_time <- r$wait()
  print(paste("We have waited", wait_time,
              "second(s) and are now allowed to process event number", i))
  i <- i + 1
}
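One way to picture a rule of "n events per lap" is a sliding window over the times of the most recent events. The sketch below is only an illustration under that assumption; `make_limiter()` is a hypothetical helper, not the algorithm implemented by the Rule class. The clock is passed in as an argument so that the logic can be followed without actually sleeping.

```r
# Hypothetical helper (not part of sched): a sliding-window limiter allowing
# at most n events in any window of `lap` seconds.
make_limiter <- function(n = 3L, lap = 1) {
  stamps <- numeric(0)  # times of the most recent events
  function(now) {
    # Forget the events that are older than one lap
    stamps <<- stamps[now - stamps < lap]
    wait <- 0
    if (length(stamps) >= n) {
      # The oldest remembered event must leave the window first
      wait <- stamps[1] + lap - now
      now <- now + wait
      stamps <<- stamps[now - stamps < lap]
    }
    # Record the time at which the new event is allowed
    stamps <<- c(stamps, now)
    wait
  }
}

# Two events allowed per second; three events requested within 0.5 seconds
limiter <- make_limiter(n = 2L, lap = 1)
waits <- c(limiter(0), limiter(0.25), limiter(0.5))
print(waits)
```

In real use the caller would pass the current time, for instance `as.numeric(Sys.time())`, and call `Sys.sleep()` with the returned value before processing the event.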
clone()
The objects of this class are cloneable with this method.
Rule$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create a new Rule object:
rule <- sched::Rule$new(n=1,lap=0.2) # 1 event allowed each 0.2 seconds

# Wait to be allowed to process with first event:
rule$wait() # The first event will be allowed directly, no waiting time.
# Process your first event here

rule$wait() # The second event will be delayed 0.2 seconds. This time
            # includes the time passed between the first call to wait() and
            # this one.
# Process your second event here
Class for scheduling web requests.
The Scheduler class controls the frequency of access to web sites through the definition of access rules (Rule class). It handles GET and POST requests, as well as file downloading. It can use a cache system to store request results and avoid resending identical requests.
new()
New instance initializer.
There should be only one Scheduler instance in an application. There is no sense in having two or more instances, since they will ignore each other and break the access frequency rules when they contact the same sites.
Scheduler$new( default_rule = Rule$new(), ssl_verifypeer = TRUE, nb_max_tries = 10L, cache_dir = tools::R_user_dir("sched", which = "cache"), user_agent = NULL, dwnld_timeout = 3600 )
default_rule
The default rule to use when none has been defined for a site.
ssl_verifypeer
If set to TRUE (default), SSL certificate will be checked, otherwise certificates will be ignored.
nb_max_tries
Maximum number of tries when running a request.
cache_dir
Set the path to the file system cache. Set to NULL to disable the cache system. The cache system will save downloaded content and reuse it later for identical requests.
user_agent
The application name and contact address to send to the contacted web server.
dwnld_timeout
The timeout used by the downloadFile() method, in seconds.
Nothing.
# Create a scheduler instance with a custom default_rule
scheduler <- sched::Scheduler$new(default_rule=sched::Rule$new(10, 1),
                                  cache_dir = NULL)
setRule()
Defines a rule for a site.
Defines a rule for a site. The site is identified by its hostname. Each time a request is made to this host (i.e. the URL contains the defined hostname), the scheduling rule is applied in order to wait (sleep) if needed before sending the request.
If a rule already exists for this hostname, it will be replaced.
Scheduler$setRule(host, n = 3L, lap = 1)
host
The hostname of the site.
n
Number of events during a time lap.
lap
Duration of a time lap, in seconds.
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a rule with default values
scheduler$setRule('www.ebi.ac.uk')
# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)
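The idea of selecting a rule from the hostname of a URL, with a fallback on a default rule, can be sketched in base R as follows. The helpers `host_of()` and `rule_for()` are hypothetical, rules are represented by plain lists rather than Rule objects, and the regular expression is a simplification that does not handle every URL form (for instance user information before the host).

```r
# Hypothetical helpers (not part of sched).
# Extract the hostname found between the scheme and the first '/', ':',
# '?' or '#'.
host_of <- function(url) {
  sub("^[A-Za-z][A-Za-z0-9+.-]*://([^/:?#]+).*$", "\\1", url)
}

# Return the rule defined for the host of `url`, or the default rule.
rule_for <- function(url, rules, default = list(n = 3L, lap = 1)) {
  host <- host_of(url)
  if (host %in% names(rules)) rules[[host]] else default
}

rules <- list("www.ebi.ac.uk" = list(n = 7L, lap = 2))
r1 <- rule_for("https://www.ebi.ac.uk/webservices/chebi/2.0/test", rules)
r2 <- rule_for("https://my.other.site/index.html", rules)
print(r1$n)  # rule defined for www.ebi.ac.uk
print(r2$n)  # default rule
```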
sendRequest()
Sends a request, and retrieves content result.
Scheduler$sendRequest(request, cache_read = TRUE)
request
A sched::Request instance.
cache_read
If set to TRUE and the cache system is enabled, the cache system will be searched for the request and the cached result returned. In any case, if the cache system is enabled and the request is sent, the retrieved content will be stored into the cache.
The results returned by the contacted server, as a single string value.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Define a scheduling rule of 7 requests every 2 seconds
scheduler$setRule('www.ebi.ac.uk', n=7, lap=2)

# Create a request object
u <- 'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity'
url <- sched::URL$new(url=u, params=c(chebiId=15440))
request <- sched::Request$new(url)

# Send the request and get the content result
content <- scheduler$sendRequest(request)
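The effect of `cache_read` described above (read from the cache only when asked, but always store what has been retrieved) can be illustrated with an in-memory cache. In this sketch, `send_with_cache()` and `fetch()` are hypothetical stand-ins, and an environment replaces the file system cache used by sched.

```r
# Hypothetical helper (not part of sched): `fetch` stands in for the code
# that really contacts the server.
send_with_cache <- function(key, fetch, cache, cache_read = TRUE) {
  if (cache_read && exists(key, envir = cache, inherits = FALSE)) {
    # Cached result found: the server is not contacted
    return(get(key, envir = cache, inherits = FALSE))
  }
  content <- fetch()
  # Retrieved content is always stored for later requests
  assign(key, content, envir = cache)
  content
}

cache <- new.env()
nb_calls <- 0
fetch <- function() {
  nb_calls <<- nb_calls + 1
  paste("content", nb_calls)
}

a <- send_with_cache("key-1", fetch, cache)                      # fetched
b <- send_with_cache("key-1", fetch, cache)                      # from cache
c3 <- send_with_cache("key-1", fetch, cache, cache_read = FALSE) # fetched again
print(c(a, b, c3))
```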
downloadFile()
Downloads the content of a URL and saves it into the specified destination file.
This method works for any URL, even though it was written with large files in mind. Since it uses utils::download.file(), which saves the content directly on disk, the cache system is not used.
Scheduler$downloadFile(url, dest_file, quiet = FALSE, timeout = NULL)
url
The URL to access, as a sched::URL object.
dest_file
A path to a destination file.
quiet
The quiet parameter for utils::download.file().
timeout
The timeout in seconds. Defaults to value provided in initializer.
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Create a temporary directory
tmp_dir <- tempdir()

# Download a file
u <- sched::URL$new(
  'https://gitlab.com/cnrgh/databases/r-sched/-/raw/main/README.md',
  c(ref_type='heads'))
scheduler$downloadFile(u, file.path(tmp_dir, 'README.md'))

# Remove the temporary directory
unlink(tmp_dir, recursive = TRUE)
getUrlString()
Builds a URL string, using a base URL and parameters to be passed.
The provided base URL and parameters are combined into a full URL string.
DEPRECATED. Use the sched::URL class and its method toString() instead.
Scheduler$getUrlString(url, params = list())
url
A URL string.
params
A list of URL parameters.
The full URL string as a single character value.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Create a URL string
url.str <- scheduler$getUrlString(
  'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params=c(chebiId=15440))
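For readers unfamiliar with how a base URL and named parameters combine into a full URL string, here is a minimal base R sketch. `build_url()` is a hypothetical helper written for this example; the string produced by sched (escaping rules, parameter order) may differ in its details.

```r
# Hypothetical helper (not part of sched): append named parameters to a
# base URL as a query string.
build_url <- function(base, params = character()) {
  if (length(params) == 0) {
    return(base)
  }
  # Percent-encode each key and each value separately
  enc <- function(x) {
    vapply(x, utils::URLencode, character(1), reserved = TRUE,
           USE.NAMES = FALSE)
  }
  keys <- enc(names(params))
  values <- enc(as.character(params))
  paste0(base, "?", paste(keys, values, sep = "=", collapse = "&"))
}

full_url <- build_url(
  'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params = c(chebiId = 15440))
print(full_url)
```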
getUrl()
Sends a request and gets the result.
DEPRECATED. Use method sendRequest() instead.
Scheduler$getUrl( url, params = list(), method = c("get", "post"), header = NULL, body = NULL, encoding = NULL )
url
A URL string.
params
A list of URL parameters.
method
The method to use. Either 'get' or 'post'.
header
The header to send.
body
The body to send.
encoding
The encoding to use.
The results of the request.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Send request
content <- scheduler$getUrl(
  'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params=c(chebiId=15440))
deleteRules()
Removes all defined rules, including the ones automatically defined using default_rule.
Scheduler$deleteRules()
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)
# Delete all defined rules
scheduler$deleteRules()
getNbRules()
Gets the number of defined rules, including the ones automatically defined using default_rule.
Scheduler$getNbRules()
The number of rules defined.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Get the number of defined rules
print(scheduler$getNbRules())
setOffline()
Enables or disables offline mode.
If offline mode is enabled, an error will be raised when the class attempts to send a request. This mode is mainly useful when debugging the usage of the cache system.
Scheduler$setOffline(offline)
offline
Set to TRUE to enable offline mode, and FALSE otherwise.
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Enable offline mode
scheduler$setOffline(TRUE)
isOffline()
Tests if offline mode is enabled.
Scheduler$isOffline()
TRUE if offline mode is enabled, FALSE otherwise.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Test if offline mode is enabled
if (scheduler$isOffline()) print("Scheduler is offline.")
clone()
The objects of this class are cloneable with this method.
Scheduler$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create a scheduler instance without cache
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Define a rule with default values
scheduler$setRule('www.ebi.ac.uk')

# Create a request object
u <- 'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity'
url <- sched::URL$new(url=u, params=c(chebiId=15440))
request <- sched::Request$new(url)

# Send the request and get the content result
content <- scheduler$sendRequest(request)

## ------------------------------------------------
## Method `Scheduler$new`
## ------------------------------------------------

# Create a scheduler instance with a custom default_rule
scheduler <- sched::Scheduler$new(default_rule=sched::Rule$new(10, 1),
                                  cache_dir = NULL)

## ------------------------------------------------
## Method `Scheduler$setRule`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Define a rule with default values
scheduler$setRule('www.ebi.ac.uk')

# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)

## ------------------------------------------------
## Method `Scheduler$sendRequest`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Define a scheduling rule of 7 requests every 2 seconds
scheduler$setRule('www.ebi.ac.uk', n=7, lap=2)

# Create a request object
u <- 'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity'
url <- sched::URL$new(url=u, params=c(chebiId=15440))
request <- sched::Request$new(url)

# Send the request and get the content result
content <- scheduler$sendRequest(request)

## ------------------------------------------------
## Method `Scheduler$downloadFile`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Create a temporary directory
tmp_dir <- tempdir()

# Download a file
u <- sched::URL$new(
    'https://gitlab.com/cnrgh/databases/r-sched/-/raw/main/README.md',
    c(ref_type='heads'))
scheduler$downloadFile(u, file.path(tmp_dir, 'README.md'))

# Remove the temporary directory
unlink(tmp_dir, recursive = TRUE)

## ------------------------------------------------
## Method `Scheduler$getUrlString`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Create a URL string
url.str <- scheduler$getUrlString(
    'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
    params=c(chebiId=15440))

## ------------------------------------------------
## Method `Scheduler$getUrl`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Send request
content <- scheduler$getUrl(
    'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
    params=c(chebiId=15440))

## ------------------------------------------------
## Method `Scheduler$deleteRules`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)

# Delete all defined rules
scheduler$deleteRules()

## ------------------------------------------------
## Method `Scheduler$getNbRules`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Get the number of defined rules
print(scheduler$getNbRules())

## ------------------------------------------------
## Method `Scheduler$setOffline`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Enable offline mode
scheduler$setOffline(TRUE)

## ------------------------------------------------
## Method `Scheduler$isOffline`
## ------------------------------------------------

# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Test if offline mode is enabled
if (scheduler$isOffline()) print("Scheduler is offline.")
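The examples above define rules such as "7 requests every 2 seconds" (n=7, lap=2). The following base R sketch shows one possible way to read such a rule: at most n requests inside any window of lap seconds. The helper wait_time() and its arguments past and now are hypothetical names introduced for illustration; this is not sched's internal implementation, which may compute delays differently.

```r
# Hypothetical helper, not part of sched.
# past: numeric vector of times (in seconds) at which earlier requests were sent
# now:  current time in seconds
# n:    maximum number of requests allowed within one window
# lap:  window length in seconds
wait_time <- function(past, now, n, lap) {
  # Keep only the requests that still fall inside the current window
  recent <- past[past > now - lap]
  if (length(recent) < n) {
    return(0)  # Fewer than n requests in the window: send immediately
  }
  # The n-th most recent request must leave the window before sending again
  blocking <- sort(recent, decreasing = TRUE)[n]
  max(0, blocking + lap - now)
}

# Seven requests sent at 0.0, 0.1, ..., 0.6 s; an eighth is wanted at 0.7 s
past <- seq(0, 0.6, by = 0.1)
delay <- wait_time(past, now = 0.7, n = 7, lap = 2)
```

A caller could pass the returned delay to Sys.sleep() before sending the next request.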
URL class.
This class represents a URL object that can be used in requests. It handles parameters as a list, making it easy to build URLs for contacting web services.
new()
Initializer.
URL$new(url = character(), params = character(), chomp_extra_slashes = TRUE)
url
The URL to access, as a character vector.
params
The list of parameters to append to this URL. If it is an unnamed list or vector, the values will be converted to strings and concatenated with the & separator. If it is a named list or vector, the names will be used as keys as in "name1=value1&name2=value2&...".
chomp_extra_slashes
If set to TRUE, then slashes at the end and the beginning of each element of the url vector parameter will be removed before proper concatenation.
Nothing.
# Create a URL object
url <- sched::URL$new("https://www.my.server/", c(param1=12, param2='abc'))
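To make the argument descriptions concrete, here is a minimal base R sketch of the behavior they describe: trimming extra slashes from each element of the url vector, joining the elements, and appending the parameters as a query string. The function build_url() is a hypothetical helper written for illustration, not part of sched, and the exact string produced by URL$toString() may differ (for instance in how a trailing slash is kept).

```r
# Hypothetical helper, not part of sched: mimics the documented behavior of
# the url, params and chomp_extra_slashes arguments.
build_url <- function(url, params = character(), chomp_extra_slashes = TRUE) {
  if (chomp_extra_slashes) {
    # Remove slashes at the beginning and at the end of each element
    url <- gsub("^/+|/+$", "", url)
  }
  base <- paste(url, collapse = "/")
  if (length(params) == 0) {
    return(base)
  }
  # Convert every value to a string
  values <- vapply(params, as.character, character(1), USE.NAMES = FALSE)
  keys <- names(params)
  if (is.null(keys)) {
    keys <- rep("", length(values))
  }
  # Named values become "key=value"; unnamed values are kept as they are
  pairs <- ifelse(keys == "", values, paste(keys, values, sep = "="))
  paste(base, paste(pairs, collapse = "&"), sep = "?")
}

named <- build_url("https://www.my.server/", c(param1 = 12, param2 = "abc"))
unnamed <- build_url(c("https://www.uniprot.org/", "/uniprot"), c("a", "b"))
print(named)
print(unnamed)
```

The leading pattern only strips slashes at the very start of an element, so the "//" after the scheme is left untouched.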
getDomain()
Extracts the domain name from the URL.
URL$getDomain()
The domain.
# Create a URL object
url <- sched::URL$new("https://www.my.server/", c(param1=12, param2='abc'))

# Extract the domain name
print(url$getDomain())
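For readers who want to see what extracting a domain involves, the following base R sketch strips the scheme and everything after the host name. extract_domain() is a hypothetical helper, not part of sched; it ignores user information (such as "user@host") and drops any port number, so getDomain() may behave differently in those cases.

```r
# Hypothetical helper, not part of sched.
extract_domain <- function(url) {
  # Drop the scheme, e.g. "https://"
  no_scheme <- sub("^[A-Za-z][A-Za-z0-9+.-]*://", "", url)
  # Keep everything up to the first "/", "?", "#" or ":" (port separator)
  sub("[/?#:].*$", "", no_scheme)
}

domain <- extract_domain("https://www.my.server/?param1=12&param2=abc")
print(domain)
```

The result has the same form as the host names passed to Scheduler$setRule() in the examples, such as 'www.ebi.ac.uk'.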
setUrl()
Sets the base URL string.
URL$setUrl(url)
url
The base URL string.
Nothing.
# Create an empty URL object
url <- sched::URL$new()

# Set the URL
url$setUrl('https://www.my.server/')

# Convert the URL to a string
print(url$toString())
setParam()
Sets a parameter.
URL$setParam(key, value)
key
The parameter name.
value
The value of the parameter.
Nothing.
# Create a URL object
url <- sched::URL$new('https://www.my.server/')

# Set a parameter
url$setParam('a', 12)

# Convert the URL to a string
print(url$toString())
print()
Displays information about this instance.
URL$print()
self as invisible.
# Create a URL object
url <- sched::URL$new('https://www.my.server/')

# Print the URL object
print(url)
toString()
Gets the URL as a string representation.
URL$toString(encode = TRUE)
encode
If set to TRUE, then encodes the URL.
The URL as a string, with all parameters and values set.
# Create a URL object
url <- sched::URL$new('https://www.my.server/', c(a=12))

# Convert the URL to a string
print(url$toString())
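The effect of encoding a URL can be illustrated with base R's utils::URLencode(), which percent-encodes characters that are not allowed in a URL, such as spaces. Whether toString(encode = TRUE) relies on this function internally is not stated here, so treat the sketch below as an illustration of URL encoding in general rather than of sched's exact output; the example URL is made up.

```r
# A URL whose parameter value contains a space (illustrative address)
raw_url <- "https://www.my.server/search?columns=id,entry name"

# Percent-encode unsafe characters; reserved characters such as
# "/", "?", "=" and "," are left untouched by default
encoded <- utils::URLencode(raw_url)
print(encoded)

# Encoding a single value with reserved = TRUE also escapes "&" and "="
value <- utils::URLencode("a&b=c", reserved = TRUE)
print(value)

# Decoding restores the original string
decoded <- utils::URLdecode(encoded)
```

Encoding individual values with reserved = TRUE prevents a "&" or "=" inside a value from being read as a parameter separator.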
clone()
The objects of this class are cloneable with this method.
URL$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Create a URL object from a base URL string and a list of parameters
base.url <- c("https://www.uniprot.org", "uniprot")
params <- c(query="reviewed:yes+AND+organism:9606",
            columns='id,entry name,protein names',
            format="tab")
url <- sched::URL$new(url=base.url, params=params)

# Print the URL converted to a string
print(url$toString())

## ------------------------------------------------
## Method `URL$new`
## ------------------------------------------------

# Create a URL object
url <- sched::URL$new("https://www.my.server/", c(param1=12, param2='abc'))

## ------------------------------------------------
## Method `URL$getDomain`
## ------------------------------------------------

# Create a URL object
url <- sched::URL$new("https://www.my.server/", c(param1=12, param2='abc'))

# Extract the domain name
print(url$getDomain())

## ------------------------------------------------
## Method `URL$setUrl`
## ------------------------------------------------

# Create an empty URL object
url <- sched::URL$new()

# Set the URL
url$setUrl('https://www.my.server/')

# Convert the URL to a string
print(url$toString())

## ------------------------------------------------
## Method `URL$setParam`
## ------------------------------------------------

# Create a URL object
url <- sched::URL$new('https://www.my.server/')

# Set a parameter
url$setParam('a', 12)

# Convert the URL to a string
print(url$toString())

## ------------------------------------------------
## Method `URL$print`
## ------------------------------------------------

# Create a URL object
url <- sched::URL$new('https://www.my.server/')

# Print the URL object
print(url)

## ------------------------------------------------
## Method `URL$toString`
## ------------------------------------------------

# Create a URL object
url <- sched::URL$new('https://www.my.server/', c(a=12))

# Convert the URL to a string
print(url$toString())