Stars
Resource for the book Trino: The Definitive Guide (and formerly Presto: The Definitive Guide)
LSH and Hypercube algorithms for Approximate Nearest Neighbor search. Centroid-based clustering using Lloyd's and reverse assignment algorithms.
Code release for CVPR'24 submission 'OmniGlue'
📺 Discover the latest machine learning / AI courses on YouTube.
AtroCore is an open-source Data Platform, Data Management and Master Data Management (MDM) software, which can be used to quickly create any business application.
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualiz…
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
Ambari stack service for installing and managing Apache Airflow on an HDP cluster
Install Ambari 2.7.5 with HDP 3.1.4 without using Hortonworks repositories.
🔥Highlighting the top ML papers every week.
Complete high-quality practice tests of 50 questions each will help you master your Confluent Certified Developer for Apache Kafka (CCDAK) exam: These practice exams will help you assess and ensure…
If you are planning or preparing for an Apache Kafka certification, this is the right place for you. Many Apache Kafka certifications are available on the market, but CCDAK (Confluent Cert…
Complete high-quality practice tests of 50 questions each will help you master your Confluent Certified Developer for Apache Kafka (CCDAK) exam: These practice exams will help you assess and ensure…
Construct a modern data stack and orchestrate the workflows to create high-quality data for analytics and ML applications.
A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.
High quality resources & applications for LLMs, multi-modal models and VectorDBs
Generative AI on AWS
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
ssattari / path-to-senior-engineer-handbook
Forked from jordan-cutler/path-to-senior-engineer-handbook
All the resources you need to get to Senior Engineer and beyond
Must-read papers on graph neural networks (GNN)
shuhuai007 / GNNPapers
Forked from thunlp/GNNPapers
Must-read papers on graph neural networks (GNN)
Ambari stack service for easily installing and managing Hue on an HDP cluster