- Row based with schema
- schema in the file
- schema is json
- block compression, splittable files
- schema evolution
bin/sqlline -u "jdbc:drill:schema=dfs;zk=local"
http://www.json-generator.com/ <- nice one
http://www.fakemailgenerator.com/ <- works, but the generated addresses really do receive email
Monitoring and Metrics
Servers and Services
DevOps Techniques, Tools and Processes
One tip that seems obvious: look at what the Internet innovators and large-scale startups are doing. Many of them have open sourced their tools, among them Square, Netflix, Google, Twitter, Facebook and LinkedIn.
I. Data in Memory
Persistent Key-Value Store for Java from the excellent Java Advent calendar for 2014.
NoSQL key-value stores that run in memory, like Redis, are incredibly helpful for scalability. Memcached is also an option. GemFire is insanely powerful and scales across WANs; it runs some huge transaction sites and is awesome with Java. Indian Railways is an incredibly interesting example of scaling on in-memory data grids.
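To make the in-memory idea concrete, here is a minimal cache-aside sketch. It is only an illustration: a plain `ConcurrentHashMap` stands in for a store such as Redis or Memcached, and the class name `CacheAsideSketch`, the `slowLookup` function and the `user:42` key are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: a ConcurrentHashMap stands in for an in-memory store
// such as Redis or Memcached. It is not a client for either product.
final class CacheAsideSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> slowLookup; // assumed slow backend, e.g. a database
    private final AtomicInteger backendCalls = new AtomicInteger();

    CacheAsideSketch(Function<String, String> slowLookup) {
        this.slowLookup = slowLookup;
    }

    // Return the cached value, going to the backend only on a cache miss.
    String get(String key) {
        return cache.computeIfAbsent(key, k -> {
            backendCalls.incrementAndGet();
            return slowLookup.apply(k);
        });
    }

    // How many times the backend was actually consulted.
    int backendCalls() {
        return backendCalls.get();
    }

    public static void main(String[] args) {
        CacheAsideSketch store = new CacheAsideSketch(k -> "value-for-" + k);
        System.out.println(store.get("user:42")); // miss: loaded from the backend
        System.out.println(store.get("user:42")); // hit: served from memory
        System.out.println("backend calls: " + store.backendCalls());
    }
}
```

A real deployment would swap the map for a networked store and add expiry, but the read path keeps the same shape: check memory first, fall back to the slow source once.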
II. Microservices / 12 Factor Apps on a PaaS
It’s hard to scale a lumbering beast; it’s easier when you have a swarm of agile services. When each service has an API that can be shared internally or externally, you open your architecture to extension and use by outside parties, which is key to growing it. This has worked well for Uber, Google, Facebook, Twitter and a host of others.
- Microservices with Microarchitectures
- Microservices with Docker
- Microservices with Spring Boot
- 2015 is the Year of Microservices
The Platform to Enable Microservices: An Open PaaS. With the Linux Foundation running it, CloudFoundry is the open choice; it’s the Tomcat of the PaaS. Extensible, standard, fast, flexible, elastic and open: CloudFoundry rocks.
III. Reactive Programming
Part II – Draft
IV. Rapid Data Ingest
A key piece of the HA puzzle is having the data you need instantly available and
Whether it’s Scala, MapReduce, Apache Spark, Groovy or Java 8 with lambdas, functional programming is bringing new ways of increasing performance and solving big data and real-time programming issues.
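As a small illustration of the functional style in Java 8, the sketch below counts words with lambdas and the Streams API. The class name `WordCountSketch` and the sample lines are made up for the example; the same map/filter/group shape is what MapReduce and Spark jobs express at larger scale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical example: a word count written as a stream pipeline.
final class WordCountSketch {

    // Split each line on whitespace, drop empty tokens, and count occurrences.
    // A TreeMap keeps the result sorted by word so output is stable.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("scale out not up", "scale fast");
        System.out.println(countWords(lines));
    }
}
```

Because the pipeline has no shared mutable state, switching `stream()` to `parallelStream()` is one way to spread the work across cores, although whether that helps depends on the data size.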
VI. Message Driven
Decoupling services and running them asynchronously greatly improves scalability and keeps one weak link from breaking your data chain.
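A rough sketch of the idea, assuming an in-process `BlockingQueue` in place of a real message broker: the producer only puts messages on the queue and never calls the consumer directly. The class name `MessageQueueSketch`, the `STOP` sentinel and the `handled:` prefix are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: an in-process queue stands in for a message broker.
final class MessageQueueSketch {
    private static final String STOP = "__stop__"; // sentinel that ends the consumer loop

    // The producer enqueues messages; a separate consumer thread handles them.
    static List<String> process(List<String> messages) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<String> handled = new ArrayList<>(); // written only by the consumer thread

        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take(); // blocks until a message arrives
                    if (STOP.equals(message)) {
                        return;
                    }
                    handled.add("handled:" + message);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (String message : messages) {
                queue.put(message); // the producer never waits on the handler itself
            }
            queue.put(STOP);
            consumer.join(); // join makes the consumer's writes visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while processing", e);
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(process(Arrays.asList("order-1", "order-2")));
    }
}
```

With a real broker the queue lives outside both processes, so a slow or crashed consumer delays handling but does not take the producer down with it.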
VII. Netflix Architecture / Spring Cloud Netflix
Internet pioneer Netflix has created a number of amazing tools that keep applications scaling, failing fast and recovering. These tools have been integrated with Spring and are coming to the PaaS. They include critical pieces such as circuit breakers and service discovery, which are very important for managing all the microservices in your architecture. There are also great tools for running on AWS.
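To show what a circuit breaker does, here is a deliberately simplified sketch in plain Java. It is not the Netflix or Spring Cloud implementation; the class name `CircuitBreakerSketch`, the threshold and the cool-down value are assumptions chosen for illustration.

```java
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker. Not the Netflix/Spring Cloud API.
final class CircuitBreakerSketch {
    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long to fail fast before a trial call
    private int consecutiveFailures = 0;
    private long openedAt = 0L;
    private boolean open = false;

    CircuitBreakerSketch(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    // Run the remote call unless the circuit is open; use the fallback on failure.
    // Holding the lock during the call keeps the sketch short but serializes callers.
    synchronized <T> T call(Supplier<T> remoteCall, Supplier<T> fallback) {
        if (open) {
            if (System.currentTimeMillis() - openedAt < openMillis) {
                return fallback.get(); // fail fast without touching the remote service
            }
            open = false; // cool-down elapsed: allow one trial call
        }
        try {
            T result = remoteCall.get();
            consecutiveFailures = 0; // a success resets the failure count
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                open = true;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }

    synchronized boolean isOpen() {
        return open;
    }

    public static void main(String[] args) {
        CircuitBreakerSketch breaker = new CircuitBreakerSketch(2, 60_000L);
        for (int i = 0; i < 3; i++) {
            String reply = breaker.<String>call(
                    () -> { throw new RuntimeException("service down"); },
                    () -> "cached default");
            System.out.println(reply + " (open=" + breaker.isOpen() + ")");
        }
    }
}
```

The point of the pattern is that once a dependency is known to be failing, callers get an immediate fallback instead of piling up on timeouts, which gives the failing service room to recover.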
X. Continuous Delivery
Having a tool that builds your applications from Git and runs automated tests and static analysis is critical.
Circuit Breaker for Inter-Service Communication and External Service Access
Reactive Streams / Asynchronous Streams
XI. Responsive Front-Ends / Single Page Applications
WebSockets with SockJS and STOMP. HTML5. AngularJS / Backbone / Ember / …
XII. Polyglot Programming
There are many different frameworks for building modern web applications: Play, MEAN, Spring Boot, Dropwizard, Ratpack, Sinatra, Rails.
XIII. Front-End Tooling
Gulp, Grunt, Bower, …
Yarn, CloudFoundry, Kubernetes.
XV. Native Applications
iOS and Android applications have access to more sensors and offer a better user experience and a superior look and feel. There are awesome tools that make this development faster and cleaner.
XVI. Using Netty
XVII. API Documentation with Swagger