Title: | Translate 'SQL' Queries into 'R' Expressions |
---|---|
Description: | Translate 'SQL' 'SELECT' statements into lists of 'R' expressions. |
Authors: | Ian Cook [aut, cre], Cloudera [cph] |
Maintainer: | Ian Cook <[email protected]> |
License: | Apache License 2.0 |
Version: | 0.3.2.9000 |
Built: | 2024-10-31 20:26:27 UTC |
Source: | https://github.com/ianmcook/queryparser |
Returns a character vector containing all the column references
in the clauses of a parsed SQL SELECT
statement
column_references(tree, from = TRUE)
column_references(tree, from = TRUE)
tree |
a list returned by |
from |
a logical value indicating whether to include the column
references from the join conditions in the |
The returned character vector includes only column
references, not table references. Column aliases assigned in the
SELECT
list are not included unless they are used in other clauses.
A character vector containing all the unique column references found
in the SELECT
, FROM
(if from = TRUE
), WHERE
,
GROUP BY
, HAVING
, and ORDER BY
clauses of the
SELECT
statement
my_query <- "SELECT f.flight, manufacturer, p.model FROM flights f JOIN planes p USING (tailnum);" column_references(parse_query(my_query), from = FALSE)
my_query <- "SELECT f.flight, manufacturer, p.model FROM flights f JOIN planes p USING (tailnum);" column_references(parse_query(my_query), from = FALSE)
Extracts the column alias assignment from an expression used in
the SELECT
list of a SQL query
extract_alias(expr)
extract_alias(expr)
expr |
a character string containing a SQL expression which might have a column alias assignment at the end |
The expression must not contain any unquoted whitespace characters
except spaces, and there must be no unquoted runs or two or more spaces.
Use squish_sql
to satisfy this whitespace requirement.
queryparser also uses this function internally to extract table aliases
used in the FROM
clause.
a character string containing the inputted SQL expression with the column alias assignment removed (if it existed) and with the assigned alias as its name
expr <- "round(AVG(arr_delay)) AS avg_delay" extract_alias(expr)
expr <- "round(AVG(arr_delay)) AS avg_delay" extract_alias(expr)
Parses a SQL expression into an R expression
parse_expression(expr, tidyverse = FALSE, secure = TRUE)
parse_expression(expr, tidyverse = FALSE, secure = TRUE)
expr |
a character string containing a SQL expression |
tidyverse |
set to |
secure |
set to |
The expression must not end with a column alias assignment.
Use extract_alias
to extract and remove column alias
assignments.
The expression must not contain any unquoted whitespace characters except
spaces, and there must be no unquoted runs or two or more spaces. The
expression must not contain line comments (--
) or block comments
(/* */
). Use squish_sql
to satisfy these whitespace
requirements and remove any comments.
an unevaluated R expression (a call
)
expr <- "round(AVG(arr_delay))" parse_expression(expr)
expr <- "round(AVG(arr_delay))" parse_expression(expr)
Parses a SQL SELECT
statement into a list with R
expressions
parse_query(query, tidyverse = FALSE, secure = TRUE)
parse_query(query, tidyverse = FALSE, secure = TRUE)
query |
a character string containing a SQL |
tidyverse |
set to |
secure |
set to |
See the
current
limitations section of the README
for information about what types
of queries are supported.
A list object with named elements representing the clauses of the query, containing sublists of unevaluated R expressions translated from the SQL expressions in the query.
Depending on the arguments, the returned list and its sublists will have
attributes named distinct
and aggregate
with logical values
that can aid in the evaluation of the R expressions. If query
contains one or more joins, then the sublist named from
will have
attributes named join_types
and join_conditions
specifying
the types of join and the join conditions.
my_query <- "SELECT origin, dest, COUNT(flight) AS num_flts, round(AVG(distance)) AS dist, round(AVG(arr_delay)) AS avg_delay FROM flights WHERE distance BETWEEN 200 AND 300 AND air_time IS NOT NULL GROUP BY origin, dest HAVING num_flts > 3000 ORDER BY num_flts DESC, avg_delay DESC LIMIT 100;" parse_query(my_query) parse_query(my_query, tidyverse = TRUE)
my_query <- "SELECT origin, dest, COUNT(flight) AS num_flts, round(AVG(distance)) AS dist, round(AVG(arr_delay)) AS avg_delay FROM flights WHERE distance BETWEEN 200 AND 300 AND air_time IS NOT NULL GROUP BY origin, dest HAVING num_flts > 3000 ORDER BY num_flts DESC, avg_delay DESC LIMIT 100;" parse_query(my_query) parse_query(my_query, tidyverse = TRUE)
Splits a SQL SELECT
statement into clauses, and splits
comma-separated column lists within the clauses.
split_query(query, tidyverse)
split_query(query, tidyverse)
query |
a character string containing a SQL |
tidyverse |
for queryparser internal use only |
A list object with named elements representing the clauses of the query
my_query <- "SELECT origin, dest, COUNT(flight) AS num_flts, round(AVG(distance)) AS dist, round(AVG(arr_delay)) AS avg_delay FROM flights WHERE distance BETWEEN 200 AND 300 AND air_time IS NOT NULL GROUP BY origin, dest HAVING num_flts > 3000 ORDER BY num_flts DESC, avg_delay DESC LIMIT 100;" split_query(my_query)
my_query <- "SELECT origin, dest, COUNT(flight) AS num_flts, round(AVG(distance)) AS dist, round(AVG(arr_delay)) AS avg_delay FROM flights WHERE distance BETWEEN 200 AND 300 AND air_time IS NOT NULL GROUP BY origin, dest HAVING num_flts > 3000 ORDER BY num_flts DESC, avg_delay DESC LIMIT 100;" split_query(my_query)
Replaces every unquoted run of whitespace characters with a
single space and removes all line comments (--
) and block comments
(/* */
). Whitespace and comment marks within quotes are not
modified.
squish_sql(x)
squish_sql(x)
x |
a character string containing a SQL query or expression |
a character string containing the squished query or expression with comments removed
Unqualifies column references in the clauses of a parsed SQL
SELECT
statement that begin with any of the specified prefixes
followed by a dot
unqualify_query(tree, prefixes, except = character(0))
unqualify_query(tree, prefixes, except = character(0))
tree |
a list returned by |
prefixes |
a character vector containing one or more table names or table aliases |
except |
a character vector containing column references to leave as is (optional) |
In the returned list, the FROM
clause is unmodified and
column alias assignments made in the SELECT
clause are unmodified.
A list the same as tree
but with all column references in the
SELECT
, WHERE
, GROUP BY
, HAVING
, and
ORDER BY
clauses unqualified, except those in except
my_query <- "SELECT f.flight, manufacturer, p.model FROM flights f JOIN planes p USING (tailnum);" unqualify_query( parse_query(my_query), prefixes = c("p", "f") )
my_query <- "SELECT f.flight, manufacturer, p.model FROM flights f JOIN planes p USING (tailnum);" unqualify_query( parse_query(my_query), prefixes = c("p", "f") )