cuDF (https://github.com/rapidsai/cudf) is a Python GPU DataFrame library for processing data: loading, joining, aggregating, and filtering. Moving this work to the GPU enables massive acceleration, because a GPU has far more cores than a CPU.
In my experience, the best use case is not explicit parallelism: when pandas processing is slow, you can simply switch to cuDF instead of writing tedious parallel code.
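Because cuDF largely mirrors the pandas API, the switch can be as small as changing an import. A minimal sketch of that pattern (my own illustration, not from the cuDF docs; it falls back to pandas when cuDF or a GPU is unavailable):

```python
# Drop-in pattern: prefer cuDF on a GPU machine, fall back to pandas on CPU.
# The code below runs unchanged with either library.
try:
    import cudf as xdf  # GPU DataFrame library
except ImportError:
    import pandas as xdf  # CPU fallback with a matching API subset

df = xdf.DataFrame({'key': [0, 0, 1, 1], 'val': [1.0, 2.0, 3.0, 4.0]})
means = df.groupby('key').mean()  # same call on both libraries
print(means)
```

Only the import line differs between the CPU and GPU versions; the DataFrame code itself is unchanged.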
Official documentation:
1 Docs » API Reference
2 rapidsai/cudf
Related references:
nvidia-rapids | cuDF, a DataFrame library with the same API as pandas
NVIDIA's python-GPU algorithm ecosystem | RAPIDS 0.10
nvidia-rapids | cuML, a machine learning acceleration library
nvidia-rapids | cuGraph, a NetworkX-like graph library
cuDF has developed very quickly over the past year. Every release has brought exciting new features, optimizations, and bug fixes, and 0.10 is no exception. New features in cuDF 0.10 include groupby.quantile(), Series.isin(), reading from remote/cloud filesystems (e.g. hdfs, gcs, s3), Series and DataFrame isna(), grouping by a Series of arbitrary length in groupby functions, Series covariance and Pearson correlation, and returning CuPy arrays from the DataFrame/Series .values property. In addition, the apply UDF API has been optimized, and gather and scatter methods have been added via the .iloc accessor.
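Since cuDF tracks the pandas API, these new 0.10 calls look exactly like their pandas counterparts. The sketch below uses pandas so it runs without a GPU; with cuDF installed, the same code works on cudf objects (where .values then returns a CuPy array rather than a NumPy one):

```python
import pandas as pd  # swap for `import cudf as pd` on a GPU machine

s = pd.Series([1.0, None, 3.0, 4.0])
print(s.isna().tolist())            # [False, True, False, False]
print(s.isin([3.0, 4.0]).tolist())  # [False, False, True, True]

df = pd.DataFrame({'g': ['a', 'a', 'b', 'b'], 'v': [1.0, 2.0, 3.0, 4.0]})
# per-group median via groupby.quantile(), new to cuDF in 0.10
print(df.groupby('g')['v'].quantile(0.5).tolist())  # [1.5, 3.5]
print(type(df['v'].values))  # numpy.ndarray here; a CuPy array under cuDF
```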
Beyond all of the features, optimizations, and bug fixes above, cuDF 0.10 also puts a lot of effort into building for the future. This release merges the cuStrings repository into cuDF and prepares the two codebases to be combined, enabling string functionality to be integrated more tightly into cuDF for faster acceleration and more functions. RAPIDS also added the cuStreamz meta-package, so you can use cuDF together with the Streamz library to simplify GPU-accelerated stream processing. cuDF keeps improving pandas API compatibility and Dask DataFrame interoperability, so that users can adopt cuDF as seamlessly as possible.
Behind the scenes, the internal architecture of libcudf is undergoing a major redesign. 0.10 introduces the new cudf::column and cudf::table classes, which greatly improve the robustness of memory-ownership control and lay the foundation for supporting variable-size data types in the future (including string columns, arrays, and structs). Work to support these new classes across the whole libcudf API will continue in the next release cycle. libcudf 0.10 also adds many new APIs and algorithms, including sort-based grouping functions with null-data support, grouped quantile and median, cudf::unique_count, cudf::repeat, cudf::scatter_to_tables, and more. As usual, this release also includes many other improvements and fixes.
The RAPIDS memory manager library RMM is also undergoing a series of restructurings. This includes a new architecture based on memory resources, largely compatible with C++17's std::pmr::memory_resource, which makes it easier for the library to add new types of memory allocators behind a common interface. 0.10 also replaces the CFFI Python bindings with Cython, so that C++ exceptions can be propagated as Python exceptions and more actionable errors reach the application. The next release will continue to improve exception support in RMM.
Finally, you'll notice that cuDF got significantly faster in this release, with notable performance improvements including joins (up to 11x) and gather and scatter on tables (2-3x faster), and more, as shown in Figure 5.
Figure 5: cuDF vs pandas speedups on a single NVIDIA Tesla V100 GPU versus dual-socket Intel Xeon E5-2698 v4 CPUs (20 cores)
cuDF can be installed directly with conda, or via docker; see: https://github.com/rapidsai/cudf
conda version, cudf == 0.10:
# for CUDA 9.2
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
cudf=0.10 python=3.6 cudatoolkit=9.2
# or, for CUDA 10.0
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
cudf=0.10 python=3.6 cudatoolkit=10.0
# or, for CUDA 10.1
conda install -c rapidsai -c nvidia -c numba -c conda-forge \
cudf=0.10 python=3.6 cudatoolkit=10.1
docker version, see: https://rapids.ai/start.html#prerequisites
docker pull rapidsai/rapidsai:cuda10.1-runtime-ubuntu16.04-py3.7
docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
rapidsai/rapidsai:cuda10.1-runtime-ubuntu16.04-py3.7
Build a DataFrame with a datetime column:

import cudf
import numpy as np
from datetime import datetime, timedelta

t0 = datetime.strptime('2018-10-07 12:00:00', '%Y-%m-%d %H:%M:%S')
n = 5
df = cudf.DataFrame({
    'id': np.arange(n),
    'datetimes': np.array([(t0 + timedelta(seconds=x)) for x in range(n)])
})
df
Build DataFrame via list of rows as tuples:
>>> import cudf
>>> import numpy as np
>>> df = cudf.DataFrame([
...     (5, "cats", "jump", np.nan),
...     (2, "dogs", "dig", 7.5),
...     (3, "cows", "moo", -2.1, "occasionally"),
... ])
>>> df
0 1 2 3 4
0 5 cats jump null None
1 2 dogs dig 7.5 None
2 3 cows moo -2.1 occasionally
pandas to cuDF:
>>> import pandas as pd
>>> import cudf
>>> pdf = pd.DataFrame({'a': [0, 1, 2, 3], 'b': [0.1, 0.2, None, 0.3]})
>>> df = cudf.from_pandas(pdf)
>>> df
a b
0 0 0.1
1 1 0.2
2 2 nan
3 3 0.3
cuDF to pandas:
>>> import cudf
>>> gdf = cudf.DataFrame({
'a': [1, 2, None], 'b': [3, None, 5]})
>>> gdf.fillna(4).to_pandas()
a b
0 1 3
1 2 4
2 4 5
>>> gdf.fillna({
'a': 3, 'b': 4}).to_pandas()
a b
0 1 3
1 2 4
2 3 5
Select rows by position with the .iloc accessor:

df = cudf.DataFrame({
    'a': list(range(20)),
    'b': list(range(20)),
    'c': list(range(20))})
df
df.iloc[1]
a 1
b 1
c 1
Name: 1, dtype: int64
apply_rows
import cudf
import numpy as np
from numba import cuda
df = cudf.DataFrame()
df['in1'] = np.arange(1000, dtype=np.float64)
def kernel(in1, out):
    for i, x in enumerate(in1):
        print('tid:', cuda.threadIdx.x, 'bid:', cuda.blockIdx.x,
              'array size:', in1.size, 'block threads:', cuda.blockDim.x)
        out[i] = x * 2.0

outdf = df.apply_rows(kernel,
                      incols=['in1'],
                      outcols=dict(out=np.float64),
                      kwargs=dict())
print(outdf['in1'].sum()*2.0)
print(outdf['out'].sum())
999000.0
999000.0
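Conceptually, apply_rows JIT-compiles the kernel with Numba and maps it elementwise over the rows of the input column. The equivalent CPU computation is just a vectorized NumPy expression (my own sketch for checking the numbers, not cuDF code):

```python
import numpy as np

# CPU equivalent of the apply_rows kernel above: out[i] = in1[i] * 2.0
in1 = np.arange(1000, dtype=np.float64)
out = in1 * 2.0

print(out.sum())  # 999000.0, matching the apply_rows result
```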
apply_chunks
import cudf
import numpy as np
from numba import cuda
df = cudf.DataFrame()
df['in1'] = np.arange(100, dtype=np.float64)
def kernel(in1, out):
    print('tid:', cuda.threadIdx.x, 'bid:', cuda.blockIdx.x,
          'array size:', in1.size, 'block threads:', cuda.blockDim.x)
    for i in range(cuda.threadIdx.x, in1.size, cuda.blockDim.x):
        out[i] = in1[i] * 2.0

outdf = df.apply_chunks(kernel,
                        incols=['in1'],
                        outcols=dict(out=np.float64),
                        kwargs=dict(),
                        chunks=16,
                        tpb=8)
print(outdf['in1'].sum()*2.0)
print(outdf['out'].sum())
9900.0
9900.0
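The difference from apply_rows is that apply_chunks splits the column into chunks (16 rows each here) and assigns one thread block (tpb=8 threads) per chunk, with each thread striding through its chunk. A CPU sketch of that same chunk-plus-stride traversal (my own illustration, not cuDF code):

```python
import numpy as np

in1 = np.arange(100, dtype=np.float64)
out = np.empty_like(in1)

chunk, tpb = 16, 8  # chunk size and "threads per block" from the example
for start in range(0, in1.size, chunk):        # one "block" per chunk
    end = min(start + chunk, in1.size)
    for tid in range(tpb):                     # each "thread" in the block...
        for i in range(start + tid, end, tpb): # ...strides through the chunk
            out[i] = in1[i] * 2.0

print(out.sum())  # 9900.0, matching the apply_chunks result
```

Every index is visited exactly once, so the result equals a plain elementwise doubling; the chunk/stride structure only matters for how the work is distributed across GPU threads.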
Apply a function per group with groupby.apply:

from cudf import DataFrame

df = DataFrame()
df['key'] = [0, 0, 1, 1, 2, 2, 2]
df['val'] = [0, 1, 2, 3, 4, 5, 6]
groups = df.groupby(['key'], method='cudf')
# Define a function to apply to each row in a group
def mult(df):
    df['out'] = df['key'] * df['val']
    return df
result = groups.apply(mult)
print(result)
Output:
key val out
0 0 0 0
1 0 1 0
2 1 2 2
3 1 3 3
4 2 4 8
5 2 5 10
6 2 6 12
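Because mult is a row-wise transform (key times val within each group), the output table can be sanity-checked with plain Python (a verification sketch, not cuDF code):

```python
# Recompute the expected 'out' column row by row.
keys = [0, 0, 1, 1, 2, 2, 2]
vals = [0, 1, 2, 3, 4, 5, 6]
out = [k * v for k, v in zip(keys, vals)]

print(out)  # [0, 0, 2, 3, 8, 10, 12], matching the table above
```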
I'll add more notes here as I continue using cuDF.