These features are described in detail below.
SQL support for some commands
SQL support is a great convenience for users. If you have looked at Databricks' Delta Lake product, you will have seen that it supports SQL syntax. Before this release, however, open source Delta Lake could only be used through Scala or Java to create, delete, and update Delta Lake tables.
The good news is that, starting with version 0.4.0, Delta Lake supports SQL syntax for some commands. Because Delta Lake is a separate project, supporting the full SQL syntax would mean copying a lot of code from Apache Spark into the Delta Lake project, which would be hard to maintain. This release therefore only supports SQL syntax for simple commands such as vacuum and history.
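The vacuum and history commands could be issued from a Spark session roughly as follows. This is a minimal sketch, not code from the release itself: the helper names (`vacuum_sql`, `history_sql`, `run_maintenance`) and the table path are hypothetical, and it assumes a SparkSession started with Delta Lake 0.4.0 on the classpath and `spark.sql.extensions` set to `io.delta.sql.DeltaSparkSessionExtension`.

```python
def vacuum_sql(path, retain_hours=168):
    """Build a VACUUM statement for the Delta table stored at `path`."""
    return f"VACUUM '{path}' RETAIN {int(retain_hours)} HOURS"


def history_sql(path):
    """Build a DESCRIBE HISTORY statement for the Delta table stored at `path`."""
    return f"DESCRIBE HISTORY '{path}'"


def run_maintenance(spark, path):
    # Assumption: `spark` is a SparkSession with the Delta SQL extension enabled.
    # Remove files no longer referenced by the table and older than the retention period.
    spark.sql(vacuum_sql(path))
    # Return the commit history of the table as a DataFrame.
    return spark.sql(history_sql(path))
```

Keeping the statement builders separate from the Spark call makes the generated SQL easy to inspect, for example `vacuum_sql("/tmp/delta/events")`, before running it against a real table.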
SQL support for the other DML operations (delete, update, and merge) may have to wait until Spark 3.0. The community is already adding support for DELETE/UPDATE/MERGE to the DataSource V2 API in Spark 3.0; for details, see https://issues.apache.org/jira/browse/SPARK-28303. We expect support for these basic SQL statements to arrive gradually in future versions.
Python API for DML and utility operations
Before the 0.4.0 release, Delta Lake only supported the Scala and Java APIs. To make Delta Lake usable from Python, this release introduces a Python API (for details, see https://github.com/delta-io/delta/issues/89), which you can use to run update, delete, merge, and other operations on Delta Lake tables.
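The three DML operations might look like this through the Python API. Treat it as a sketch under stated assumptions: the table path, the column names (`eventId`, `date`, `eventType`, `data`), the `updates` DataFrame, and the helpers `source_columns` and `apply_dml` are all hypothetical, and the `delta` Python package must be available to the Spark session.

```python
def source_columns(columns, alias="u"):
    """Map each target column to the same-named column of the source alias."""
    return {c: f"{alias}.{c}" for c in columns}


def apply_dml(spark, path, updates):
    # Imported lazily so this module loads even where delta is not installed.
    from delta.tables import DeltaTable

    table = DeltaTable.forPath(spark, path)

    # delete: remove rows matching a SQL predicate.
    table.delete("date < '2017-01-01'")

    # update: values are SQL expressions, hence the quoted string literal.
    table.update("eventType = 'clck'", {"eventType": "'click'"})

    # merge (upsert): update rows that match on eventId, insert the rest.
    (table.alias("t")
        .merge(updates.alias("u"), "t.eventId = u.eventId")
        .whenMatchedUpdate(set=source_columns(["data"]))
        .whenNotMatchedInsert(
            values=source_columns(["eventId", "date", "eventType", "data"]))
        .execute())
```

A call would be `apply_dml(spark, "/tmp/delta/events", updates_df)`, where `updates_df` holds the new and changed rows.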
We can also use the Python API to run utility operations such as vacuum and history, which brings the Python API in line with the Scala and Java APIs. For more on using the Python API, see the official Delta Lake documentation.
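For the utility operations, a short sketch in the same style is given here. The function name `maintain`, its defaults, and the path are illustrative assumptions; `vacuum` and `history` are methods of `DeltaTable` in the Python API.

```python
def maintain(spark, path, retention_hours=168.0, last_n=5):
    # Imported lazily so this module loads even where delta is not installed.
    from delta.tables import DeltaTable

    table = DeltaTable.forPath(spark, path)

    # Delete files no longer referenced by the table and older than the retention period.
    table.vacuum(retention_hours)

    # Return the last `last_n` commits as a DataFrame.
    return table.history(last_n)
```

Calling `maintain(spark, "/tmp/delta/events").show()` would then print the most recent commits of that table.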
Converting a Parquet table to a Delta Lake table
Suppose we have an ordinary Parquet table and want to turn it into a Delta Lake table. Previously, we had to read the whole table and write it out again as a Delta Lake table, which takes a lot of resources if the Parquet table is large. This release provides a convert command that turns a Parquet table into a Delta Lake table in place. Note that "in place" means the data does not need to be moved from one location to another, and the data in the original directory does not need to be read and rewritten in full. The command lists all the files in the Parquet table, automatically infers the table schema by reading the footers of all the Parquet files, and finally generates a transaction log to track these files. Of course, if you no longer need the Delta Lake table, you can also convert it back to an ordinary Parquet table.
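An in-place conversion through the Python API could be sketched as follows. The helper names `parquet_identifier` and `convert_in_place` and the example path are hypothetical; `DeltaTable.convertToDelta` is the API call, and for a partitioned table it also needs the partition schema (for example the DDL string `"date string"`), since that information is not stored in the Parquet footers.

```python
def parquet_identifier(path):
    """Build the parquet.`<path>` identifier that the convert command expects."""
    return f"parquet.`{path}`"


def convert_in_place(spark, path, partition_schema=None):
    # Imported lazily so this module loads even where delta is not installed.
    from delta.tables import DeltaTable

    identifier = parquet_identifier(path)
    if partition_schema is None:
        # Unpartitioned table: the schema is inferred from the Parquet footers.
        return DeltaTable.convertToDelta(spark, identifier)
    # Partitioned table: partition columns must be declared explicitly.
    return DeltaTable.convertToDelta(spark, identifier, partition_schema)
```

After `convert_in_place(spark, "/data/events")`, the directory keeps its original Parquet files and gains a transaction log that tracks them.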