dask.org
https://docs.dask.org/en/stable/dataframe.html
Dask DataFrame — Dask documentation
A Dask DataFrame is a large parallel DataFrame composed of many smaller pandas DataFrames, split along the index. These pandas DataFrames may live on disk for larger-than-memory computing on a single machine, or on many different machines in a cluster.
dask.org
https://tutorial.dask.org/01_dataframe.html
Dask DataFrame - parallelized pandas
At its core, the dask.dataframe module implements a “blocked parallel” DataFrame object that looks and feels like the pandas API, but for parallel and distributed workflows.
dask.org
https://examples.dask.org/dataframe.html
Dask DataFrames — Dask Examples documentation
Most common Pandas operations can be used in the same way on Dask dataframes. This example shows how to slice the data based on a mask condition and then determine the standard deviation of the data in the x column.
dask.org
https://docs.dask.org/en/stable/generated/dask.dat…
dask.dataframe.DataFrame — Dask documentation
class dask.dataframe.DataFrame(expr): DataFrame-like Expr Collection. The constructor takes the expression that represents the query as input. The class is not meant to be instantiated directly. Instead, use one of the IO connectors from Dask.
dask.org
https://docs.dask.org/en/stable/dataframe-api.html
Dask DataFrame API with Logical Query Planning
Similar to pandas, Dask provides dtype-specific methods under various accessors. These are separate namespaces within Series that only apply to specific data types.
dask.org
https://docs.dask.org/en/stable/dataframe-create.h…
Load and Save Data with Dask DataFrames
Learn how to create DataFrames and store them. Create a Dask DataFrame from various data storage formats like CSV, HDF, Apache Parquet, and others.
dask.org
https://examples.dask.org/dataframes/01-data-acces…
DataFrames: Read and Write Data — Dask Examples documentation
Dask Dataframes can read and store data in many of the same formats as Pandas dataframes. In this example we read and write data with the popular CSV and Parquet formats, and discuss best practices when using these formats.
dask.org
https://docs.dask.org/en/stable/generated/dask.dat…
dask.dataframe.from_pandas
This splits an in-memory Pandas dataframe into several parts and constructs a dask.dataframe from those parts on which Dask.dataframe can operate in parallel. By default, the input dataframe will be sorted by the index to produce cleanly-divided partitions (with known divisions).
dask.org
https://docs.dask.org/en/stable/dataframe-indexing…
Indexing into Dask DataFrames
In addition to pandas-style indexing, Dask DataFrame also supports indexing at a partition level with DataFrame.get_partition() and DataFrame.partitions. These can be used to select subsets of the data by partition, rather than by position in the entire DataFrame or index label.
dask.org
https://docs.dask.org/en/stable/index.html
Dask — Dask documentation
Dask use is widespread, across all industries and scales. Dask is used anywhere Python is used and people experience pain due to large scale data, or intense computing.