| Title: | Request Scheduler |
|---|---|
| Description: | Offers classes and functions to contact web servers while enforcing the scheduling rules required by the sites. The URL class makes it easy to construct a URL by providing parameters as a vector. The Request class describes SOAP (Simple Object Access Protocol) or standard requests: URL, method (POST or GET), header, body. The Scheduler class controls the request frequency for each server address by means of rules (Rule class). The RequestResult class gives access to the request status, to handle error cases, and to the content. |
| Authors: | Pierrick Roger [aut, cre] |
| Maintainer: | Pierrick Roger <[email protected]> |
| License: | AGPL-3 |
| Version: | 1.0.3 |
| Built: | 2026-06-03 11:11:19 UTC |
| Source: | https://github.com/cran/sched |
sched package.
sched offers classes and functions to contact web servers while enforcing the scheduling rules required by the sites. The URL class makes it easy to construct a URL by providing parameters as a vector. The Request class describes SOAP or standard requests: URL, method (POST or GET), header, body. The Scheduler class controls the request frequency for each server address by means of rules (Rule class). The RequestResult class gives access to the request status, to handle error cases, and to the content.
Maintainer: Pierrick Roger [email protected] (ORCID)
Send the request described by a Request instance, using the provided user agent, and return the results.
get_url_request_result(
  request,
  useragent = NULL,
  ssl_verifypeer = TRUE,
  binary = FALSE
)
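
| Argument | Description |
|---|---|
| request | A sched::Request object. |
| useragent | The user agent, as a character value. Example: "myapp ; [email protected]" |
| ssl_verifypeer | Set to TRUE (default) to check the SSL certificate of the server, or to FALSE to ignore certificates. |
| binary | Set to TRUE if the content to be retrieved is binary. |
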
The request result, as a character value.
# Retrieve the content of a web page
u <- sched::URL$new('https://httpbin.org/get')
content <- sched::get_url_request_result(sched::Request$new(u))
Construct a sched::Request object with a valid header for a POST request.
make_post_request(url, body, mime, soap_action = NULL, encoding = NULL)

| Argument | Description |
|---|---|
| url | A sched::URL object. |
| body | The body of the POST request. |
| mime | The MIME type of the body. Example: "text/xml", "application/json". |
| soap_action | In case of a SOAP request, the SOAP action to contact, as a character string. |
| encoding | The encoding to use. A valid integer or string as required by RCurl. |

A sched::Request object.
# Prepare the URL and the request body
the_url <- sched::URL$new('https://httpbin.org/anything')
the_body <- '{"some_key": "my_value"}'
# Make the request object
my_request <- sched::make_post_request(the_url, body = the_body,
  mime = "application/json")
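The soap_action argument described above is not covered by the JSON example. The following is a minimal sketch of a SOAP request, assuming the package is installed; the envelope content and the action name 'getEntity' are hypothetical placeholders chosen for illustration, not values required by any real service.

```r
# Hypothetical SOAP envelope and action name, for illustration only
soap_body <- paste0(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<soapenv:Envelope ',
  'xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">',
  '<soapenv:Body><getEntity><id>15440</id></getEntity></soapenv:Body>',
  '</soapenv:Envelope>')

# Build the POST request with an XML MIME type and a SOAP action
soap_request <- sched::make_post_request(
  sched::URL$new('https://httpbin.org/anything'),
  body = soap_body, mime = 'text/xml', soap_action = 'getEntity')

# Inspect the header that was generated for the request
print(soap_request$getHeaderAsSingleString())
```

Nothing is sent at this stage: the object only describes the request, and it still has to be passed to a Scheduler instance or to get_url_request_result().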
Class Request.
This class represents a Request object that can be used with the Request Scheduler.
new()
Initializer.
Request$new(
url,
method = c("get", "post"),
header = NULL,
body = NULL,
encoding = NULL
)
url: A sched::URL object.
method: HTTP method. Either "get" or "post".
header: The header of the POST method as a named character vector. The names are the fields of the header.
body: The body as a single character value.
encoding: The encoding to use. A valid integer or string as required by RCurl.
Nothing.
# Create a GET request for the getCompleteEntity webservice of ChEBI
# database
request <- sched::Request$new(
sched::URL$new(
'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
params=c(chebiId=15440)))
# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
'batch')),
method='post', header=c('Content-Type'="", apikey='my-token'),
body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
getUrl()
Gets the URL.
Request$getUrl()
The URL of this Request object as a sched::URL object.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the stored URL object
print(request$getUrl())
getMethod()
Gets the method.
Request$getMethod()
The method as a character value.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the stored method
print(request$getMethod())
getEncoding()
Gets the encoding.
Request$getEncoding()
The encoding.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://my.site.fr/'),
encoding='UTF-8')
# Get the stored encoding
print(request$getEncoding())
getCurlOptions()
Gets the options object to pass to the cURL library.
Makes an RCurl::CURLOptions object by calling the RCurl::curlOptions() function. The user agent, header and body are passed as options if not NULL.
Request$getCurlOptions(useragent = NULL)
useragent: The user agent as a character value, or NULL.
An RCurl::CURLOptions object.
# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
'batch')),
method='post', header=c('Content-Type'="", apikey='my-token'),
body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
# Get the associated RCurl options object
rcurl_opts <- request$getCurlOptions('myapp ; [email protected]')
getUniqueKey()
Gets a unique key to identify this request.
The key is an MD5 sum computed from the string representation of this request.
Request$getUniqueKey()
A unique key as an MD5 sum.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the MD5 sum of this request
print(request$getUniqueKey())
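Because the key is derived from the string representation of the request, two requests built from the same description are expected to share the same key, which is what makes it usable as a cache identifier. A minimal sketch, assuming the package is installed (the variable names are illustrative):

```r
# Two separate objects describing the same GET request
request_a <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
request_b <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# A request to a different site
request_c <- sched::Request$new(sched::URL$new('https://my.site.fr/'))

key_a <- request_a$getUniqueKey()
key_b <- request_b$getUniqueKey()
key_c <- request_c$getUniqueKey()

# Compare the keys of identical and of different requests
print(identical(key_a, key_b))
print(identical(key_a, key_c))
```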
getHeaderAsSingleString()
Gets the HTTP header as a string, concatenating all its information into a single string.
Request$getHeaderAsSingleString()
The header as a single character value.
# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
'batch')),
method='post', header=c('Content-Type'="", apikey='my-token'),
body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
# Get back the POST header as a single string
print(request$getHeaderAsSingleString())
getBody()
Gets the body.
Request$getBody()
The body as a single character value.
# Create a POST Request object for the records-batch-post webservice of
# ChemSpider database
request <- sched::Request$new(
url=sched::URL$new(c('https://api.rsc.org/compounds/v1/', 'records',
'batch')),
method='post', header=c('Content-Type'="", apikey='my-token'),
body='{"recordIds": [2], "fields": ["SMILES","Formula","InChI"]}')
# Get back the POST body
print(request$getBody())
print()
Displays information about this instance.
Request$print()
self as invisible.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Print the Request object
print(request)
toString()
Gets a string representation of this instance.
Request$toString()
A single string giving a representation of this instance.
# Create a GET request
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
# Get the string representation of this request
print(request$toString())
clone()
The objects of this class are cloneable with this method.
Request$clone(deep = FALSE)
deep: Whether to make a deep clone.
# Create a GET request for the getCompleteEntity webservice of ChEBI database
request <- sched::Request$new(
  sched::URL$new(
    'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
    params=c(chebiId=15440)))
# Get an MD5 key, unique to this request
key <- request$getUniqueKey()
# Print the request
print(request)
Class RequestResult.
Represents the result of a request.
new()
New instance initializer.
RequestResult$new(
  content = NULL,
  retry = FALSE,
  err_msg = NULL,
  status = 0,
  status_msg = "",
  retry_after = NULL,
  location = NULL
)
content: The result content.
retry: Whether the request should be sent again.
err_msg: Error message.
status: HTTP status.
status_msg: Status message.
retry_after: Time after which to retry.
location: New location.
Nothing.
getContent()
Get content.
RequestResult$getContent()
The content as a character value or NULL.
getRetry()
Get the retry flag.
RequestResult$getRetry()
TRUE if the URL request should be sent again, FALSE otherwise.
getErrMsg()
Get the error message.
RequestResult$getErrMsg()
The error message as a character value or NULL.
getStatus()
Get the HTTP status of the response.
RequestResult$getStatus()
The status as an integer.
getRetryAfter()
Get the time to wait before retrying.
RequestResult$getRetryAfter()
The time.
getLocation()
Get the redirect location.
RequestResult$getLocation()
The redirect location as a character value or NULL.
processRequestErrors()
Process possible HTTP error.
RequestResult$processRequestErrors()
Nothing.
clone()
The objects of this class are cloneable with this method.
RequestResult$clone(deep = FALSE)
deep: Whether to make a deep clone.
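The getters above can be combined to decide what to do with a result. The following is a minimal sketch, assuming the package is installed and that the initializer accepts the values shown; the instance is built by hand with made-up values purely for illustration, whereas in normal use such an object would be produced by the package when a request is sent.

```r
# Hand-made result describing a temporary failure (illustrative values)
result <- sched::RequestResult$new(
  content = NULL, retry = TRUE, err_msg = 'Service unavailable',
  status = 503, status_msg = 'Service Unavailable')

if (result$getRetry()) {
  # The request should be sent again later
  message('Status ', result$getStatus(), ': ', result$getErrMsg(),
          '. The request should be sent again.')
} else {
  # The request is final: use the content
  print(result$getContent())
}
```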
Scheduling rule class.
This class represents a scheduling rule, used to limit the number of events during a certain lap of time.
new()
Initializer.
Rule$new(n = 3L, lap = 1)
n: Number of events during a time lap.
lap: Duration of a time lap, in seconds.
Nothing.
# Create a rule object with default parameters
r <- Rule$new()
# Create a rule object with 5 events allowed each second (default time)
r2 <- Rule$new(5L)
# Create a rule object with 5 events allowed each 3 seconds
r3 <- Rule$new(5L, 3)
getN()
Gets the number of events allowed during a lap time.
Rule$getN()
Returns the number of events as an integer.
r <- Rule$new()
# Get the allowed number of events for a rule
print(r$getN())
getLapTime()
Gets the lap time.
The number of seconds during which N events are allowed.
Rule$getLapTime()
Returns the lap time as a numeric value.
# Create a rule object with default parameters
r <- Rule$new()
# Get the configured lap time for a rule
print(r$getLapTime())
print()
Displays information about this instance.
Rule$print()
Nothing.
# Create a rule object with default parameters
r <- Rule$new()
# Print information about a rule object
print(r)
wait()
Wait (sleep) until a new event is allowed.
Rule$wait(do_sleep = TRUE)
do_sleep: Debug parameter that turns off the call to Sys.sleep(). Use only for testing.
The time spent waiting, in seconds.
# Create a rule object that allows 3 events each 0.02 seconds
r <- Rule$new(3, 0.02)
# Loop for generating 20 events
i <- 0 # event index
while (i < 20) {
# Wait until next event is allowed
wait_time <- r$wait()
print(paste("We have waited", wait_time,
"second(s) and are now allowed to process event number", i))
i <- i + 1
}
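To make the idea of "at most n events per lap" concrete, here is a small base R sketch of how a waiting time could be derived from the timestamps of past events. It is only an illustration of the principle under stated assumptions: time_to_wait() is a hypothetical helper that is not part of sched, and the package's internal logic may differ.

```r
# Hypothetical helper, not part of sched: number of seconds to wait before
# a new event is allowed, given the timestamps (in seconds) of past events.
time_to_wait <- function(event_times, now, n = 3L, lap = 1) {
  # Keep only the events that fall inside the current lap
  recent <- event_times[event_times > now - lap]
  if (length(recent) < n) {
    return(0)
  }
  # The n-th most recent event must leave the window before a new one fits
  nth_recent <- sort(recent, decreasing = TRUE)[n]
  max(0, nth_recent + lap - now)
}

# Three events occurred during the last second: a fourth one has to wait
wait_full <- time_to_wait(c(10.0, 10.2, 10.5), now = 10.6, n = 3L, lap = 1)
# Only two events occurred during the last second: no waiting is needed
wait_free <- time_to_wait(c(9.0, 10.2, 10.5), now = 10.6, n = 3L, lap = 1)
print(c(wait_full = wait_full, wait_free = wait_free))
```

In the first call the oldest of the three recent events leaves the one-second window at time 11.0, so the wait is about 0.4 seconds.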
clone()
The objects of this class are cloneable with this method.
Rule$clone(deep = FALSE)
deep: Whether to make a deep clone.
# Create a new Rule object:
rule <- sched::Rule$new(n=1,lap=0.2) # 1 event allowed each 0.2 seconds
# Wait to be allowed to process the first event:
rule$wait() # The first event will be allowed directly, no waiting time.
# Process your first event here
rule$wait() # The second event will be delayed 0.2 seconds. This time
            # includes the time passed between the first call to wait() and
            # this one.
# Process your second event here
Class for scheduling web requests.
The Scheduler class controls the frequency of access to web sites, through
the definition of access rules (Rule class).
It handles GET and POST requests, as well as file downloading.
It can use a cache system to store request results and avoid resending
identical requests.
new()
New instance initializer.
There should be only one Scheduler instance in an application. There is no sense in having two or more instances, since they will ignore each other and break the access frequency rules when they contact the same sites.
Scheduler$new(
default_rule = Rule$new(),
ssl_verifypeer = TRUE,
nb_max_tries = 10L,
cache_dir = tools::R_user_dir("sched", which = "cache"),
user_agent = NULL,
dwnld_timeout = 3600
)
default_rule: The default rule to use when none has been defined for a site.
ssl_verifypeer: If set to TRUE (default), SSL certificate will be checked, otherwise certificates will be ignored.
nb_max_tries: Maximum number of tries when running a request.
cache_dir: Set the path to the file system cache. Set to NULL to disable the cache system. The cache system will save downloaded content and reuse it later for identical requests.
user_agent: The application name and contact address to send to the contacted web server.
dwnld_timeout: The timeout used by the downloadFile() method, in seconds.
Nothing.
# Create a scheduler instance with a custom default_rule
scheduler <- sched::Scheduler$new(default_rule=sched::Rule$new(10, 1),
cache_dir = NULL)
setRule()
Defines a rule for a site.
The site is identified by its hostname. Each time a request is made to this host (i.e. the URL contains the defined hostname), the scheduling rule is applied in order to wait (sleep) if needed before sending the request.
If a rule already exists for this hostname, it will be replaced.
Scheduler$setRule(host, n = 3L, lap = 1)
host: The hostname of the site.
n: Number of events during a time lap.
lap: Duration of a time lap, in seconds.
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a rule with default values
scheduler$setRule('www.ebi.ac.uk')
# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)
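Since a rule defined twice for the same hostname is replaced rather than added, the number of rules should not change on the second call. A minimal sketch of this behaviour, assuming the package is installed (the hostname is the placeholder used in the example above):

```r
# Create a scheduler instance without cache
scheduler <- sched::Scheduler$new(cache_dir = NULL)

# Define a first rule for the host and count the rules
scheduler$setRule('my.other.site', n = 10, lap = 3)
nb_before <- scheduler$getNbRules()

# Define a stricter rule for the same host: it replaces the previous one
scheduler$setRule('my.other.site', n = 2, lap = 1)
nb_after <- scheduler$getNbRules()

print(c(before = nb_before, after = nb_after))
```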
sendRequest()
Sends a request, and retrieves content result.
Scheduler$sendRequest(request, cache_read = TRUE)
request: A sched::Request instance.
cache_read: If set to TRUE and the cache system is enabled, the cache system will be searched for the request and the cached result returned. In any case, if the cache system is enabled and the request is sent, the retrieved content will be stored in the cache.
The results returned by the contacted server, as a single string value.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a scheduling rule of 7 requests every 2 seconds
scheduler$setRule('www.ebi.ac.uk', n=7, lap=2)
# Create a request object
u <- 'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity'
url <- sched::URL$new(url=u, params=c(chebiId=15440))
request <- sched::Request$new(url)
# Send the request and get the content result
content <- scheduler$sendRequest(request)
downloadFile()
Downloads the content of a URL and saves it into the specified destination file.
This method works for any URL, although it was written with large
files in mind.
Since it uses utils::download.file(), which saves the content
directly on disk, the cache system is not used.
Scheduler$downloadFile(url, dest_file, quiet = FALSE, timeout = NULL)
url: The URL to access, as a sched::URL object.
dest_file: A path to a destination file.
quiet: The quiet parameter for utils::download.file().
timeout: The timeout in seconds. Defaults to value provided in initializer.
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Create a dedicated temporary directory
# (tempdir() itself is shared by the whole R session and must not be deleted)
tmp_dir <- tempfile('sched_')
dir.create(tmp_dir)
# Download a file
u <- sched::URL$new(
  'https://gitlab.com/cnrgh/databases/r-sched/-/raw/main/README.md',
  c(ref_type='heads'))
scheduler$downloadFile(u, file.path(tmp_dir, 'README.md'))
# Remove the temporary directory
unlink(tmp_dir, recursive = TRUE)
getUrlString()
Builds a URL string, using a base URL and parameters to be passed.
The provided base URL and parameters are combined into a full URL string.
DEPRECATED. Use the sched::URL class and its method
toString() instead.
Scheduler$getUrlString(url, params = list())
url: A URL string.
params: A list of URL parameters.
The full URL string as a single character value.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Create a URL string
url.str <- scheduler$getUrlString(
  'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params=c(chebiId=15440))
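As this method is deprecated, the same URL string can be built with the sched::URL class and its toString() method, as recommended above. A minimal sketch, assuming the package is installed and reusing the ChEBI URL from the previous example:

```r
# Build the URL object from a base URL and its parameters
u <- sched::URL$new(
  url = 'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params = c(chebiId = 15440))

# Convert the URL object into a single string
url_str <- u$toString()
print(url_str)
```

No Scheduler instance is needed for this step, since only the URL object is involved.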
getUrl()
Sends a request and gets the result.
DEPRECATED. Use method sendRequest() instead.
Scheduler$getUrl(
url,
params = list(),
method = c("get", "post"),
header = NULL,
body = NULL,
encoding = NULL
)
url: A URL string.
params: A list of URL parameters.
method: The method to use. Either 'get' or 'post'.
header: The header to send.
body: The body to send.
encoding: The encoding to use.
The results of the request.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Send request
content <- scheduler$getUrl(
  'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params=c(chebiId=15440))
deleteRules()
Removes all defined rules, including the ones automatically defined using default_rule.
Scheduler$deleteRules()
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)
# Delete all defined rules
scheduler$deleteRules()
getNbRules()
Gets the number of defined rules, including the ones automatically defined using default_rule.
Scheduler$getNbRules()
The number of rules defined.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Get the number of defined rules
print(scheduler$getNbRules())
setOffline()
Enables or disables offline mode.
If offline mode is enabled, an error will be raised when the class attempts to send a request. This mode is mainly useful when debugging the usage of the cache system.
Scheduler$setOffline(offline)
offline: Set to TRUE to enable offline mode, and FALSE otherwise.
Nothing.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Enable offline mode
scheduler$setOffline(TRUE)
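Since an error is raised when a request is attempted in offline mode, calling code may want to catch it. The following is a minimal sketch, assuming the package is installed and that the error is raised before any network access, as described above; the cache is disabled here so that the request cannot be served from it.

```r
# Create a scheduler instance without cache and enable offline mode
scheduler <- sched::Scheduler$new(cache_dir = NULL)
scheduler$setOffline(TRUE)

# Try to send a request and capture the error message instead of stopping
request <- sched::Request$new(sched::URL$new('https://peakforest.org/'))
outcome <- tryCatch(
  scheduler$sendRequest(request),
  error = function(e) conditionMessage(e))
print(outcome)
```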
isOffline()
Tests if offline mode is enabled.
Scheduler$isOffline()
TRUE if offline mode is enabled, FALSE otherwise.
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Test if offline mode is enabled
if (scheduler$isOffline())
print("Scheduler is offline.")
clone()
The objects of this class are cloneable with this method.
Scheduler$clone(deep = FALSE)
deep: Whether to make a deep clone.
# Create a scheduler instance without cache
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a rule with default values
scheduler$setRule('www.ebi.ac.uk')
# Create a request object
u <- 'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity'
url <- sched::URL$new(url=u, params=c(chebiId=15440))
request <- sched::Request$new(url)
# Send the request and get the content result
content <- scheduler$sendRequest(request)
## ------------------------------------------------
## Method `Scheduler$new`
## ------------------------------------------------
# Create a scheduler instance with a custom default_rule
scheduler <- sched::Scheduler$new(default_rule=sched::Rule$new(10, 1),
                                  cache_dir = NULL)
## ------------------------------------------------
## Method `Scheduler$setRule`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a rule with default values
scheduler$setRule('www.ebi.ac.uk')
# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)
## ------------------------------------------------
## Method `Scheduler$sendRequest`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a scheduling rule of 7 requests every 2 seconds
scheduler$setRule('www.ebi.ac.uk', n=7, lap=2)
# Create a request object
u <- 'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity'
url <- sched::URL$new(url=u, params=c(chebiId=15440))
request <- sched::Request$new(url)
# Send the request and get the content result
content <- scheduler$sendRequest(request)
## ------------------------------------------------
## Method `Scheduler$downloadFile`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Create a temporary directory
tmp_dir <- tempdir()
# Download a file
u <- sched::URL$new(
  'https://gitlab.com/cnrgh/databases/r-sched/-/raw/main/README.md',
  c(ref_type='heads'))
scheduler$downloadFile(u, file.path(tmp_dir, 'README.md'))
# Remove the temporary directory
unlink(tmp_dir, recursive = TRUE)
## ------------------------------------------------
## Method `Scheduler$getUrlString`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Create a URL string
url.str <- scheduler$getUrlString(
  'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params=c(chebiId=15440))
## ------------------------------------------------
## Method `Scheduler$getUrl`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Send request
content <- scheduler$getUrl(
  'https://www.ebi.ac.uk/webservices/chebi/2.0/test/getCompleteEntity',
  params=c(chebiId=15440))
## ------------------------------------------------
## Method `Scheduler$deleteRules`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Define a rule with custom values
scheduler$setRule('my.other.site', n=10, lap=3)
# Delete all defined rules
scheduler$deleteRules()
## ------------------------------------------------
## Method `Scheduler$getNbRules`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Get the number of defined rules
print(scheduler$getNbRules())
## ------------------------------------------------
## Method `Scheduler$setOffline`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Enable offline mode
scheduler$setOffline(TRUE)
## ------------------------------------------------
## Method `Scheduler$isOffline`
## ------------------------------------------------
# Create a scheduler instance
scheduler <- sched::Scheduler$new(cache_dir = NULL)
# Test if offline mode is enabled
if (scheduler$isOffline()) print("Scheduler is offline.")
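The examples above define rules such as "7 requests every 2 seconds" through the n and lap arguments. How sched enforces such a rule internally is not described in this section. As a rough illustration only, one plausible reading is a sliding window: before sending, count the requests made during the last lap seconds and wait if n have already been sent. The function wait_time() below is a hypothetical base R helper, not part of sched.

```r
# Hypothetical helper, NOT part of sched: computes how long to wait before
# the next request under a sliding-window reading of "n requests per lap
# seconds". `past` holds the times (in seconds) of earlier requests.
wait_time <- function(past, now, n, lap) {
  # Keep only the requests that are still inside the window
  recent <- past[past > now - lap]
  if (length(recent) < n) {
    return(0)
  }
  # The n-th most recent request must leave the window before sending again
  blocking <- sort(recent, decreasing = TRUE)[n]
  blocking + lap - now
}

# Three requests allowed per second, three already sent: some waiting needed
w_full <- wait_time(past = c(0.0, 0.2, 0.5), now = 0.6, n = 3, lap = 1)
print(w_full)

# Only two requests in the window: no waiting needed
w_free <- wait_time(past = c(0.0, 0.2), now = 0.6, n = 3, lap = 1)
print(w_free)
```

Keeping the computation free of any call to Sys.time() or Sys.sleep() makes the logic easy to check in isolation; a caller would pass the current time and sleep for the returned duration.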
URL class.
This class represents a URL object that can be used in requests. It handles parameters as a list, making it easy to build URLs for contacting web services.
new()
Initializer.
URL$new(url = character(), params = character(), chomp_extra_slashes = TRUE)
url: The URL to access, as a character vector.
params: The list of parameters to append to this URL. If it is an unnamed list or vector, the values are converted to strings and concatenated with the & separator. If it is a named list or vector, the names are used as keys, as in "name1=value1&name2=value2&...".
chomp_extra_slashes: If set to TRUE, slashes at the beginning and end of each element of the url vector are removed before the elements are concatenated.
Nothing.
# Create a URL object
url <- sched::URL$new("https://www.my.server/", c(param1=12,
param2='abc'))
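The behaviour described for params and chomp_extra_slashes can be pictured with a small base R sketch. The helpers build_query() and join_url() are hypothetical names introduced for illustration; they follow the argument descriptions above and are not the package's own code.

```r
# Hypothetical helpers, NOT part of sched: they mimic the documented
# behaviour of the `params` and `chomp_extra_slashes` arguments.

# Turn a named or unnamed vector/list into a query string
build_query <- function(params) {
  values <- vapply(params, as.character, character(1), USE.NAMES = FALSE)
  keys <- names(params)
  if (is.null(keys)) {
    # Unnamed: values only, joined with "&"
    return(paste(values, collapse = "&"))
  }
  # Named: "key=value" pairs, joined with "&"
  paste(paste(keys, values, sep = "="), collapse = "&")
}

# Join the elements of a URL vector after trimming surrounding slashes
join_url <- function(parts) {
  trimmed <- gsub("^/+|/+$", "", parts)
  paste(trimmed, collapse = "/")
}

query <- build_query(c(param1 = 12, param2 = "abc"))
base <- join_url(c("https://www.uniprot.org/", "/uniprot"))
print(paste(base, query, sep = "?"))
```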
getDomain()
Extracts the domain name from the URL.
URL$getDomain()
The domain.
# Create a URL object
url <- sched::URL$new("https://www.my.server/",
c(param1=12, param2='abc'))
# Extract the domain name
print(url$getDomain())
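To make concrete what "domain name" means for a URL string, here is a base R sketch that pulls out the host part with a regular expression. The function extract_domain() is a hypothetical helper, and the exact rules used by getDomain() (for instance, how ports are handled) are not stated in this section, so treat this as an approximation.

```r
# Hypothetical helper, NOT part of sched: keeps what follows "scheme://"
# up to the first "/", ":", "?" or "#".
extract_domain <- function(u) {
  sub("^[A-Za-z][A-Za-z0-9+.-]*://([^/:?#]+).*$", "\\1", u)
}

domain <- extract_domain("https://www.my.server/?param1=12&param2=abc")
print(domain)
```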
setUrl()
Sets the base URL string.
URL$setUrl(url)
url: The base URL string.
Nothing.
# Create an empty URL object
url <- sched::URL$new()
# Set the URL
url$setUrl('https://www.my.server/')
# Convert the URL to a string
print(url$toString())
setParam()
Sets a parameter.
URL$setParam(key, value)
key: The parameter name.
value: The value of the parameter.
Nothing.
# Create a URL object
url <- sched::URL$new('https://www.my.server/')
# Set a parameter
url$setParam('a', 12)
# Convert the URL to a string
print(url$toString())
print()
Displays information about this instance.
URL$print()
self, invisibly.
# Create a URL object
url <- sched::URL$new('https://www.my.server/')
# Print the URL object
print(url)
toString()
Gets the URL as a string representation.
URL$toString(encode = TRUE)
encode: If set to TRUE, the URL is encoded.
The URL as a string, with all parameters and values set.
# Create a URL object
url <- sched::URL$new('https://www.my.server/', c(a=12))
# Convert the URL to a string
print(url$toString())
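Which characters toString() escapes when encode = TRUE is not detailed here. As a general illustration of what URL encoding does to a parameter value, base R's utils::URLencode() and utils::URLdecode() can be used; this is a sketch of percent-encoding in general, not a description of sched's internals.

```r
# Base R illustration of percent-encoding; NOT sched's own code.
value <- "entry name"

# Encode characters that are not allowed as-is in a URL (here, the space)
encoded <- utils::URLencode(value, reserved = TRUE)
print(encoded)

# Decoding restores the original text
decoded <- utils::URLdecode(encoded)
print(decoded)
```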
clone()
The objects of this class are cloneable with this method.
URL$clone(deep = FALSE)
deep: Whether to make a deep clone.
# Create a URL object from a base URL string and a list of parameters
base.url <- c("https://www.uniprot.org", "uniprot")
params <- c(query="reviewed:yes+AND+organism:9606",
            columns='id,entry name,protein names',
            format="tab")
url <- sched::URL$new(url=base.url, params=params)
# Print the URL converted to a string
print(url$toString())
## ------------------------------------------------
## Method `URL$new`
## ------------------------------------------------
# Create a URL object
url <- sched::URL$new("https://www.my.server/", c(param1=12,
                                                  param2='abc'))
## ------------------------------------------------
## Method `URL$getDomain`
## ------------------------------------------------
# Create a URL object
url <- sched::URL$new("https://www.my.server/",
                      c(param1=12, param2='abc'))
# Extract the domain name
print(url$getDomain())
## ------------------------------------------------
## Method `URL$setUrl`
## ------------------------------------------------
# Create an empty URL object
url <- sched::URL$new()
# Set the URL
url$setUrl('https://www.my.server/')
# Convert the URL to a string
print(url$toString())
## ------------------------------------------------
## Method `URL$setParam`
## ------------------------------------------------
# Create a URL object
url <- sched::URL$new('https://www.my.server/')
# Set a parameter
url$setParam('a', 12)
# Convert the URL to a string
print(url$toString())
## ------------------------------------------------
## Method `URL$print`
## ------------------------------------------------
# Create a URL object
url <- sched::URL$new('https://www.my.server/')
# Print the URL object
print(url)
## ------------------------------------------------
## Method `URL$toString`
## ------------------------------------------------
# Create a URL object
url <- sched::URL$new('https://www.my.server/', c(a=12))
# Convert the URL to a string
print(url$toString())