AWS re:Invent Day One Keynote Recap and Thoughts

AWS re:Invent is happening this week, and the day one keynote today already delivered some awesome announcements. The recording is on YouTube. There were major announcements for AI, IoT, natural language processing, and big data. I'm excited to see what people do with the new services. Here are my highlights and thoughts.

AI & Language

  • Lex - a conversational bot framework and backend, essentially the backend for Amazon Alexa now offered as a service on AWS.
  • Rekognition - image recognition as a service powered by deep learning.
  • Polly - text to speech as a service.

We are increasingly seeing the big three in cloud (AWS, Google, Microsoft) launch AI and NLP services, and it's great to see. Lots of applications are becoming easier to create with the addition of such services. It does make me wonder, though: will these companies end up owning much of the infrastructure for AI, or are there still opportunities for startups? I'm not the only one who debates this.

IoT

  • Greengrass - for syncing data and running applications across IoT devices and providing seamless connectivity up into the cloud. This is more of an application framework for IoT combined with backend hooks. So far I find the explanations of Greengrass to be a bit muddy. It requires a closer look.

Big Data

  • Athena - run SQL on top of data in S3, completely serverless, pay per query. Built on top of Presto, supports a few different file formats. $5 per TB of data queried.

I've been screaming for this for a long time. Athena greatly simplifies running analytics in the cloud. In other words, Amazon just took a major step toward making it pointless to run your own analytics clusters. Before this, creating any sort of production-ready, highly available cluster for SQL queries on terabytes of data required a ton of effort - I know because I've been through that many times. This lowers the barrier to entry so much.

I'm interested in seeing how the cost works out for Athena at $5 per TB queried compared to running your own databases on EC2. There are so many variables in that cost equation. You could lower your S3 footprint using different file formats and compression. You could also partition data in different ways to lower the number of bytes scanned in each query.
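To make those variables a little more concrete, here is a rough sketch of how compression and partitioning might change the cost of a single query. It assumes billing is simply $5 per TB scanned, that compressed data is billed at its compressed size, and it ignores any minimum charges or rounding. The function name, the compression ratio, and the partition fraction are all made up for illustration.

```python
ATHENA_USD_PER_TB = 5.0  # price per TB queried, as quoted in this post


def athena_query_cost(raw_tb, compression_ratio=1.0, partition_fraction=1.0):
    """Estimate the cost in USD of one query (hypothetical helper).

    raw_tb             -- uncompressed size of the dataset in TB
    compression_ratio  -- raw size divided by stored size (assumed value)
    partition_fraction -- share of the stored data the query touches, 0..1
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    if not 0.0 <= partition_fraction <= 1.0:
        raise ValueError("partition_fraction must be between 0 and 1")
    # Only the bytes actually read are assumed to be billed.
    scanned_tb = raw_tb / compression_ratio * partition_fraction
    return scanned_tb * ATHENA_USD_PER_TB


if __name__ == "__main__":
    # 10 TB of raw data, no tricks: every query reads everything.
    print(athena_query_cost(10))
    # Same data, assumed 4x compression, query touches 10% of partitions.
    print(athena_query_cost(10, compression_ratio=4.0, partition_fraction=0.1))
```

Under these assumptions the two levers multiply, so a well-laid-out dataset could cost a small fraction of the naive full scan. Real numbers will depend on the file format and the query.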

For comparison, a d2.8xlarge instance with 48 TB of storage is $5.52 per hour. That's way cheaper if you're querying all 48 TB every hour, since a single full scan of 48 TB in Athena would cost $240 at $5 per TB. However, with the d2.8xlarge you only get 36 vCPUs. Athena might be using something like Lambda behind the scenes to run on many more cores, delivering faster results over an equivalent amount of data housed in S3. That only scratches the surface. I can't wait until more people get their hands on Athena and share results.
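Here is a back-of-the-envelope sketch of that comparison, using only the two prices above. It assumes the instance runs around the clock for a 730-hour month, and it ignores S3 storage costs, operational effort, and query speed, so treat it as a starting point rather than a real cost model. The function names are mine.

```python
ATHENA_USD_PER_TB = 5.0          # price per TB queried, as quoted above
D2_8XLARGE_USD_PER_HOUR = 5.52   # hourly price quoted above
HOURS_PER_MONTH = 730            # assumption: instance is always on


def athena_monthly_cost(tb_scanned_per_query, queries_per_month):
    """Monthly Athena bill in USD if every query scans the same amount."""
    return tb_scanned_per_query * queries_per_month * ATHENA_USD_PER_TB


def instance_monthly_cost(hours=HOURS_PER_MONTH):
    """Monthly bill in USD for one always-on d2.8xlarge."""
    return hours * D2_8XLARGE_USD_PER_HOUR


def break_even_tb_per_month(hours=HOURS_PER_MONTH):
    """TB scanned per month at which Athena costs the same as the instance."""
    return instance_monthly_cost(hours) / ATHENA_USD_PER_TB


if __name__ == "__main__":
    print("One full 48 TB scan on Athena:", athena_monthly_cost(48, 1))
    print("One d2.8xlarge for a month:   ", instance_monthly_cost())
    print("Break-even TB scanned/month:  ", break_even_tb_per_month())
```

With these inputs the break-even lands at roughly 806 TB scanned per month, or about 17 full scans of the 48 TB. Scan less than that and Athena looks cheaper on paper; scan more and the instance wins, at least before you count the effort of running it.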

Compute

  • FPGA instances - programmable hardware in the cloud, specifically Field Programmable Gate Arrays. This is a first in the cloud as far as I know.

I'm really curious to see what the world does with this. We'll probably see an explosion of FPGA images being created and shared for all sorts of use cases. This could bring a bunch of people into the FPGA community.

Closing Thoughts

I always look forward to re:Invent. AWS never ceases to amaze. I'm glad to see another re:Invent start strong.