The Apache Software Foundation

Packages on Leaderboard (4)

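| Rank | Package | Downloads | Stars | Language |
| --- | --- | --- | --- | --- |
| 83 | pyarrow | 291,687,061 | 16,545 | C++ |
| 422 | pyspark | 49,835,522 | 42,908 | Scala |
| 739 | pyiceberg | 22,777,319 | 1,007 | |
| 2580 | apache-tvm-ffi | 2,296,958 | 354 | |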

Top GitHub repositories

| Repository | Description | Stars | Language |
| --- | --- | --- | --- |
| apache/superset | Apache Superset is a Data Visualization and Data Exploration Platform | 70,743 | TypeScript |
| apache/echarts | Apache ECharts is a powerful, interactive charting and data visualization library for browser | 65,825 | TypeScript |
| apache/airflow | Apache Airflow - A platform to programmatically author, schedule, and monitor workflows | 44,454 | Python |
| apache/dubbo | The java implementation of Apache Dubbo. An RPC and microservice framework. | 41,702 | Java |
| apache/kafka | Mirror of Apache Kafka | 32,076 | Java |
| apache/incubator-seata | :fire: Seata is an easy-to-use, high-performance, open source distributed transaction solution. | 25,952 | Java |
| apache/flink | Apache Flink | 25,829 | Java |
| apache/skywalking | APM, Application Performance Monitoring System | 24,722 | Java |
| apache/rocketmq | Apache RocketMQ is a cloud native messaging and streaming platform, making it simple to build event-driven applications. | 22,350 | Java |
| apache/mxnet | Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more | 20,832 | C++ |
| apache/shardingsphere | Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases. | 20,697 | Java |
| apache/pouchdb | :kangaroo: - PouchDB is a pocket-sized database. | 17,552 | JavaScript |
| apache/brpc | brpc is an Industrial-grade RPC framework using C++ Language, which is often used in high performance system such as Search, Storage, Machine learning, Advertisement, Recommendation etc. "brpc" means "better RPC". | 17,460 | C++ |
| apache/apisix | The Cloud-Native API Gateway and AI Gateway | 16,259 | Lua |
| apache/hadoop | Apache Hadoop | 15,489 | Java |
| apache/answer | A Q&A platform software for teams at any scales. Whether it's a community forum, help center, or knowledge management platform, you can always count on Apache Answer. | 15,410 | Go |
| apache/pulsar | Apache Pulsar - distributed pub-sub messaging system | 15,143 | Java |
| apache/doris | Apache Doris is an easy-to-use, high performance and unified analytics database. | 15,063 | Java |
| apache/dolphinscheduler | Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code | 14,173 | Java |
| apache/druid | Apache Druid: a high performance real-time analytics database. | 13,948 | Java |
| apache/incubator-weex | Apache Weex (Incubating) | 13,661 | C++ |
| apache/tvm | Open Machine Learning Compiler Framework | 13,151 | Python |
| apache/zookeeper | Apache ZooKeeper | 12,731 | Java |
| apache/predictionio | PredictionIO, a machine learning server for developers and ML engineers. | 12,528 | Scala |
| apache/thrift | Apache Thrift | 10,907 | C++ |
| apache/cassandra | Apache Cassandra® | 9,647 | Java |
| apache/jmeter | Apache JMeter open-source load testing tool for analyzing and measuring the performance of a variety of services | 9,242 | Java |
| apache/seatunnel | SeaTunnel is a multimodal, high-performance, distributed, massive data integration tool. | 9,135 | Java |
| apache/shenyu | Apache ShenYu is a Java native API Gateway for service proxy, protocol conversion and API governance. | 8,777 | Java |
| apache/iceberg | Apache Iceberg | 8,578 | Java |
| apache/beam | Apache Beam is a unified programming model for Batch and Streaming data processing. | 8,499 | Java |
| apache/datafusion | Apache DataFusion SQL Query Engine | 8,463 | Rust |
| apache/shardingsphere-elasticjob | Distributed scheduled job | 8,222 | Java |
| apache/tomcat | Apache Tomcat | 8,107 | Java |
| apache/hertzbeat | An AI-powered next-generation open source real-time observability system. | 7,106 | Java |
| apache/couchdb | Seamless multi-primary syncing database with an intuitive HTTP/JSON API, designed for reliability | 6,826 | Erlang |
| apache/openwhisk | Apache OpenWhisk is an open source serverless cloud platform | 6,757 | Scala |
| apache/storm | Apache Storm | 6,671 | Java |
| apache/zeppelin | Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more. | 6,605 | Java |
| apache/flink-cdc | Flink CDC is a streaming data integration tool | 6,361 | Java |
| apache/iotdb | Apache IoTDB | 6,287 | Java |
| apache/incubator-kie-drools | Drools is a rule engine, DMN engine and complex event processing (CEP) engine for Java | 6,220 | Java |
| apache/camel | Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data. | 6,139 | Java |
| apache/hudi | Upserts, Deletes And Incremental Processing on Big Data. | 6,103 | Java |
| apache/pinot | Apache Pinot - A realtime distributed OLAP datastore | 6,037 | Java |
| apache/hive | Apache Hive | 6,002 | Java |
| apache/nifi | Apache NiFi | 5,985 | Java |
| apache/fesod | Fast. Easy. Done. Processing spreadsheets without worrying about large files causing OOM. | 5,836 | Java |
| apache/hbase | Apache HBase | 5,578 | Java |
| apache/groovy | Apache Groovy: A powerful multi-faceted programming language for the JVM platform | 5,429 | Java |
| apache/dubbo-spring-boot-project | Spring Boot Project for Apache Dubbo | 5,400 | Java |
| apache/mesos | Apache Mesos | 5,365 | C++ |
| apache/calcite | Apache Calcite | 5,080 | Java |
| apache/ignite | Apache Ignite | 5,045 | Java |
| apache/maven | Apache Maven core | 4,971 | Java |
| apache/opendal | Apache OpenDAL: One Layer, All Storage. | 4,916 | Rust |
| apache/dubbo-go | Go Implementation For Apache Dubbo . | 4,887 | Go |
| apache/incubator-weex-ui | 🏄 A rich interaction, lightweight, high performance UI library based on Weex. | 4,739 | Vue |
| apache/rocketmq-externals | Mirror of Apache RocketMQ (Incubating) | 4,617 | Java |
| apache/shiro | Apache Shiro | 4,439 | Java |
| apache/lucene-solr | Apache Lucene and Solr open-source search software | 4,370 | |
| apache/incubator-pagespeed-ngx | Automatic PageSpeed optimization module for Nginx | 4,350 | C++ |
| apache/streampark | Make stream processing easier! Easy-to-use streaming application development framework and operation platform. | 4,299 | Java |
| apache/age | Graph database optimized for fast analysis and real-time data processing. It is provided as an extension to PostgreSQL. | 4,254 | C |
| apache/kvrocks | Apache Kvrocks is a distributed key value NoSQL database that uses RocksDB as storage engine and is compatible with Redis protocol. | 4,247 | C++ |
| apache/fory | A blazingly fast multi-language serialization framework powered by JIT and zero-copy. | 4,245 | Java |
| apache/dubbo-admin | The ops and reference implementation for Apache Dubbo. | 4,051 | Go |
| apache/httpd | Mirror of Apache HTTP Server. Issues: http://issues.apache.org | 3,939 | C |
| apache/iggy | Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed | 3,865 | Rust |
| apache/cordova-android | Apache Cordova Android | 3,776 | JavaScript |
| apache/kylin | Apache Kylin | 3,767 | Java |
| apache/guacamole-server | Mirror of Apache Guacamole Server | 3,726 | C |
| apache/nuttx | Apache NuttX is a mature, real-time embedded operating system (RTOS) | 3,720 | C |
| apache/incubator-heron | Apache Heron (Incubating) is a realtime, distributed, fault-tolerant stream processing engine from Twitter | 3,688 | Java |
| apache/tika | The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). | 3,590 | Java |
| apache/singa | a distributed deep learning platform | 3,588 | C++ |
| apache/logging-log4j2 | Apache Log4j is a versatile, feature-rich, efficient logging API and backend for Java. | 3,584 | Java |
| apache/incubator-kie-optaplanner | AI constraint solver in Java to optimize the vehicle routing problem, employee rostering, task assignment, maintenance scheduling, conference scheduling and other planning problems. | 3,481 | Java |
| apache/linkis | Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines. | 3,416 | Java |
| apache/maven-mvnd | Apache Maven Daemon | 3,397 | Java |
| apache/arrow-rs | Official Rust implementation of Apache Arrow | 3,381 | Rust |
| apache/lucene | Apache Lucene open-source search software | 3,351 | Java |
| apache/datafusion-sqlparser-rs | Extensible SQL Lexer and Parser for Rust | 3,324 | Rust |
| apache/avro | Apache Avro is a data serialization system. | 3,231 | Java |
| apache/paimon | Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations. | 3,200 | Java |
| apache/curator | Apache Curator | 3,166 | Java |
| apache/nutch | Apache Nutch is an extensible and scalable web crawler | 3,139 | Java |
| apache/parquet-java | Apache Parquet Java | 3,027 | Java |
| apache/pdfbox | Mirror of Apache PDFBox | 3,024 | Java |
| apache/netbeans | Apache NetBeans | 3,021 | Java |
| apache/hugegraph | A graph database that supports more than 100+ billion data, high performance and scalability (Include OLTP Engine & REST-API & Backends) | 2,973 | Java |
| apache/incubator-devlake | Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth. | 2,945 | Go |
| apache/commons-lang | Apache Commons Lang | 2,941 | Java |
| apache/gravitino | World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake. | 2,881 | Java |
| apache/grails-core | Grails - the Web Application Framework | 2,874 | Groovy |
| apache/horaedb | Apache HoraeDB (incubating) is a high-performance, distributed, cloud native time-series database. | 2,832 | Rust |
| apache/cloudstack | Apache CloudStack is an opensource Infrastructure as a Service (IaaS) cloud computing platform | 2,818 | Java |
| apache/cassandra-gocql-driver | GoCQL Driver for Apache Cassandra® | 2,678 | Go |