The core of RAPIDS is cuDF (CUDA DataFrame), a library that provides pandas-like DataFrame functionality with GPU acceleration; a DataFrame is a columnar data structure. cuDF provides a Python interface that fits into existing data science workflows. Underneath it is libcuDF, an open-source CUDA C++ library that provides a column data structure and algorithms that operate on columns, such as filtering, selection, sorting, joining, and groupby. In this talk you will learn about some of the C++ and CUDA internals of libcuDF. We’ll cover how we perform run-time type dispatch on type-erased data structures, which lets libcuDF operate on a variety of data types and interface with dynamic languages like Python. We’ll describe how and why we built a pool allocator for CUDA device memory to massively improve performance on multi-GPU systems. And we’ll dive into the GPU algorithms we use for multi-column database operations like groupby and join. If you are interested in using GPU DataFrames via libcuDF’s C/C++ interface, or in contributing to the cuDF / libcuDF open source project, then this talk is for you.
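
To make the idea of run-time type dispatch on a type-erased column concrete, here is a minimal host-only C++ sketch. It is an illustration under stated assumptions, not libcuDF's actual code: the names `type_id`, `column_view`, `type_dispatch`, `sum_functor`, and `sum_column` are hypothetical, the column lives in host memory rather than on the GPU, and only three element types are handled. The column stores its data as an untyped pointer plus a runtime tag, and the dispatcher turns that tag into a template argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical runtime type tag (an assumption, not libcuDF's real enum).
enum class type_id { INT32, INT64, FLOAT64 };

// Type-erased column: the element type is only known at run time via `type`.
struct column_view {
  const void* data;
  std::size_t size;
  type_id type;
};

// Maps the runtime tag to a compile-time type and calls the functor's
// templated operator() with that type. Every branch must return the same type.
template <typename Functor, typename... Args>
decltype(auto) type_dispatch(type_id t, Functor&& f, Args&&... args) {
  switch (t) {
    case type_id::INT32:
      return f.template operator()<std::int32_t>(std::forward<Args>(args)...);
    case type_id::INT64:
      return f.template operator()<std::int64_t>(std::forward<Args>(args)...);
    case type_id::FLOAT64:
      return f.template operator()<double>(std::forward<Args>(args)...);
  }
  throw std::invalid_argument("unsupported type_id");
}

// Example algorithm written once as a template and instantiated per type.
struct sum_functor {
  template <typename T>
  double operator()(const column_view& col) const {
    assert(col.data != nullptr || col.size == 0);
    const T* typed = static_cast<const T*>(col.data);  // recover the erased type
    double total = 0.0;
    for (std::size_t i = 0; i < col.size; ++i) {
      total += static_cast<double>(typed[i]);
    }
    return total;
  }
};

// Non-template entry point: callers (for example a Python binding) never
// need to name the element type.
double sum_column(const column_view& col) {
  return type_dispatch(col.type, sum_functor{}, col);
}
```

A caller builds a `column_view` from any supported buffer, for example `column_view{v.data(), v.size(), type_id::INT32}` for a `std::vector<std::int32_t> v`, and calls `sum_column` without naming the element type. This is what lets a dynamic language hold columns of arbitrary type behind a single non-template interface.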