About the Kiji Project
What is Kiji?
Kiji is an Apache 2.0 Licensed, open-source framework for storing and serving user data to enable real-time personalization as users interact across channels. Kiji allows for batch model training and real-time model scoring, ensuring the user experience is adapted with each interaction.
Kiji is developed on the Hadoop ecosystem, using HBase as its original underlying data store. With Kiji, developers can create a flexible and comprehensive entity-centric schema that enables a 360-degree view of each customer. Data is stored in a rich, compressed, binary Avro format, allowing the application to support complex data types. Kiji handles all aspects of serialization and deserialization while maintaining schema metadata, ensuring backwards compatibility as an application’s schema evolves, and capturing real-time application interactions.
Most organizations collect and store data in distributed file systems, such as HDFS, and key-value stores, such as HBase, in order to better serve individual customers. However, these tools are low-level, difficult to use, and have no underlying framework for integration. Kiji is the middleware necessary to ingest detailed data, stream real-time data, build predictive models, and deploy those models on the fly. The various components of Kiji address the common use cases and solve the common challenges experienced by developers, engineers, analysts, and data scientists.
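To make the idea of backwards-compatible schema evolution concrete, here is a minimal sketch in Python. It is not Kiji's or Avro's API: plain dictionaries stand in for schemas and records, and the `resolve` helper, the field names, and the `MISSING` marker are all hypothetical. It only illustrates the general rule that a newer reader schema can still read older records when every added field carries a default.

```python
# Illustrative sketch only: plain dicts stand in for Avro schemas and records.
# Nothing here is Kiji's or Avro's real API; all names are hypothetical.

MISSING = object()  # marks a reader field that has no default value


def resolve(record, reader_fields):
    """Project a record written under an older schema onto a newer reader schema.

    reader_fields maps each field name to its default (MISSING if none).
    Fields present in the record but unknown to the reader are dropped.
    """
    resolved = {}
    for name, default in reader_fields.items():
        if name in record:
            resolved[name] = record[name]
        elif default is not MISSING:
            resolved[name] = default
        else:
            raise ValueError(f"no value or default for field {name!r}")
    return resolved


# Version 1 of a hypothetical user record.
old_record = {"name": "Ada", "email": "ada@example.com"}

# Version 2 adds "loyalty_tier" with a default, so old records stay readable.
reader_v2 = {"name": MISSING, "email": MISSING, "loyalty_tier": "none"}

upgraded = resolve(old_record, reader_v2)
print(upgraded)
```

Adding a field without a default would make `resolve` raise for old records, which is the kind of incompatible change a schema-aware storage layer is meant to catch.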
The Kiji Project is modularized into separate components to support a wide range of usage and encourage clean separation of functionality. The Bento Box contains all Kiji modules assembled in a self-contained download. Each module can also be individually downloaded on GitHub.
KijiSchema: simplifies real-time storage and retrieval of diverse data, from primitive types to objects, time series, and event streams. KijiSchema handles the challenges with serialization, schema design and evolution, and metadata management that are common in NoSQL storage solutions.
KijiSchema DDL Shell: provides a Data Definition Language that allows for the creation, inspection, and modification of schemas for KijiSchema.
KijiMapReduce: provides a powerful paradigm to apply MapReduce in both batch and real-time workloads. KijiMapReduce introduces producers, which perform record-wise analytics, and gatherers, which build predictive models by analyzing aggregate behaviors.
KijiMapReduce Library: provides helpful examples and useful implementations of MapReduce jobs that can be created within Kiji.
Kiji Hive Adapter: provides HiveQL access to Kiji data through a familiar SQL shell.
Kiji Model Repository: provides a library of machine learning tools built on top of KijiExpress.
KijiREST: provides an HTTP REST API for front-end developers to access Kiji data and to trigger model scoring.
KijiScoring: provides the real-time scoring of predictive models within your application.
Get Involved in the Kiji Community
Download the Kiji source code and start hacking on Kiji!
Find bugs or suggest a new feature.
Learn more about contributing to Kiji Project.
Join a Mailing List: Developer Mailing List, User Mailing List, Issues Mailing List
Download the Kiji Bento Box
Ebi Bento Box 2.3.0
The Bento Box contains all Kiji modules assembled in a self-contained download. The Bento Box comes with a standalone Hadoop and HBase cluster and command-line tools for interacting with Kiji tables and data.
Try a Tutorial
Kiji Music Recommender
Build your own recommendation system:
- Install and create Kiji tables
- Import example metadata into tables of songs and users’ listening histories
- Run a MapReduce job: Recommend the next song a user should listen to
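The last step can be pictured with a small in-memory sketch. This is not the tutorial's code and does not use Kiji tables or MapReduce. It assumes a simple approach in which the recommendation for a song is the song that most often follows it in users' listening histories. The function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict


def count_transitions(histories):
    """Count how often each song is immediately followed by another song.

    histories maps a user ID to that user's ordered list of played songs.
    """
    counts = defaultdict(Counter)
    for plays in histories.values():
        # Pair each play with the play that came right after it.
        for current, following in zip(plays, plays[1:]):
            counts[current][following] += 1
    return counts


def recommend_next(counts, song):
    """Return the most frequent follower of `song`, or None if unseen."""
    followers = counts.get(song)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical listening histories, oldest play first.
histories = {
    "user-1": ["song-a", "song-b", "song-c"],
    "user-2": ["song-a", "song-b", "song-a", "song-b"],
    "user-3": ["song-a", "song-c"],
}

counts = count_transitions(histories)
print(recommend_next(counts, "song-a"))
```

In a MapReduce setting, the counting step would be distributed across rows of the listening-history table, with the per-song totals combined afterwards. The sketch keeps everything in one process for readability.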