DEV Community: Aravind Putrevu

15 Code Quality and Security Tools Every Developer Should Know

Aravind Putrevu — Fri, 17 Jan 2025 11:03:21 +0000

Quality and security checks are essential parts of modern software development.

As codebases grow and become more complex, automated tooling is key to maintaining standards and preventing regressions.

Here are 15 top-tier tools that can help you improve your code quality.

Interestingly, CodeRabbit automatically runs each of these tools with default settings, making your work effortless. It offers code review comments and 1-click fixes for the issues these code quality tools report.

1. Gitleaks (Code Security)

Why It’s Important:

Gitleaks scans repositories for secrets and sensitive information, helping prevent costly data leaks. It’s critical for catching hardcoded passwords, API keys, and other secrets that might accidentally slip into your source control.

Key Features:

Detects secrets in code, configs, and commit history
Configurable rulesets and whitelists
Easy to integrate into CI pipelines
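
To give a feel for what pattern-based secret detection involves, here is a minimal sketch in JavaScript. It is not how Gitleaks is implemented: the two rules and the function name findSecrets are assumptions made up for this illustration, and real rule sets are far larger and more careful about false positives.

```javascript
// Illustrative only: a tiny pattern-based secret scanner.
// The rules below are assumptions for this sketch, not Gitleaks's rule set.
const RULES = [
  // AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits.
  { id: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/g },
  // A generic "password = '...'" style assignment (hypothetical rule).
  { id: "hardcoded-password", pattern: /password\s*[:=]\s*["'][^"']{4,}["']/gi },
];

// Returns one finding per match, with the 1-based line number.
function findSecrets(text) {
  const findings = [];
  text.split("\n").forEach((line, index) => {
    for (const rule of RULES) {
      for (const match of line.matchAll(rule.pattern)) {
        findings.push({ rule: rule.id, line: index + 1, match: match[0] });
      }
    }
  });
  return findings;
}

// Made-up sample input: one harmless line and two lines that should be flagged.
const sample = [
  'const region = "us-east-1";',
  'const key = "AKIAABCDEFGHIJKLMNOP";',
  "password = 'hunter2!'",
].join("\n");

console.log(findSecrets(sample));
```

Running it on the sample flags the second and third lines and leaves the first alone.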

2. Checkov (Code Security & Configuration)

Why It’s Important:

Checkov is a popular security and compliance tool that scans Infrastructure-as-Code (IaC) frameworks such as Terraform, CloudFormation, Helm, and Kubernetes manifests. It helps ensure your infrastructure adheres to best practices before it’s deployed.

Key Features:

Broad support for major IaC frameworks
Detects misconfigurations, security weaknesses, and compliance violations
Extensive rules library maintained by a large community

3. Cppcheck (C/C++ Code Quality)

Why It’s Important:

Cppcheck analyzes C and C++ code, focusing on detecting undefined behavior, memory leaks, and other subtle problems. It’s indispensable for teams writing performance-critical, low-level software.

Key Features:

Finds issues without needing compiled code
Highly configurable to your project’s style and guidelines
Integrates well with CI/CD workflows

4. Hadolint (Dockerfile Scan)

Why It’s Important:

Hadolint checks your Dockerfiles for common pitfalls and inefficiencies. This leads to leaner, more secure, and more maintainable Docker images—ultimately improving your application’s deployment processes.

Key Features:

Warns about deprecated or inefficient instructions
Offers best-practice recommendations for Docker image building
Quick and easy to run locally or in CI

5. golangci-lint (Go Code Quality)

Why It’s Important:

For Go developers, golangci-lint aggregates multiple linters into a single tool. It catches a wide range of issues: style violations, potential bugs, performance concerns, and more.

Key Features:

Runs dozens of linters at once
Fast execution with caching and parallel running
Easily configurable for team-specific rules

6. Detekt (Kotlin Code Quality)

Why It’s Important:

Detekt provides a flexible, customizable approach to analyzing Kotlin projects. It encourages idiomatic code, detects code smells, and enforces style and complexity rules—key for scaling Kotlin codebases.

Key Features:

Checks for code smells, style violations, and complexity issues
Supports custom rule sets
Integrates smoothly into Kotlin build pipelines

7. Markdownlint (Markdown Quality)

Why It’s Important:

Documentation is just as important as code. Markdownlint enforces consistent Markdown formatting, making your documentation easier to read and maintain.

Key Features:

Enforces style rules like heading formatting, line length, and punctuation
Customizable rulesets via configuration files
Enhances readability and consistency in your docs

8. PHPStan (PHP Code Quality)

Why It’s Important:

PHPStan is a static analysis tool for PHP that finds bugs without running your code. It helps ensure that PHP code adheres to best practices, reduces runtime errors, and improves maintainability.

Key Features:

Identifies type errors, undefined variables, dead code, and more
Offers incremental adoption: start at a lower level and increase strictness over time
Highly configurable rules

9. PMD (Java Code Quality)

Why It’s Important:

PMD analyzes Java code to detect common programming flaws. It identifies issues like empty catch blocks, unnecessary object creation, and more subtle code smells, raising overall code quality.

Key Features:

A large set of built-in rules
Custom rule writing for project-specific checks
Compatible with multiple JVM languages

10. Ruff (Python Code Quality)

Why It’s Important:

Ruff is a fast Python linter focused on performance and developer productivity. It aims to be a near drop-in replacement for Flake8, with built-in support for many commonly used Python linting plugins.

Key Features:

Ultra-fast execution
Batteries-included approach: multiple checks and plugins out of the box
Seamless integration into Python projects

11. Rubocop (Ruby Code Quality)

Why It’s Important:

Rubocop enforces the Ruby community’s style guide and detects code smells. It helps maintain consistent, idiomatic Ruby code, making your applications more readable and maintainable.

Key Features:

Vast set of built-in rules aligned with Ruby’s best practices
Autofix capabilities for many style violations
Extensible through custom cop (rule) definitions

12. Semgrep (Code Security)

Why It’s Important:

Semgrep provides secure code scanning for multiple languages. Its rule-based scanning identifies both security issues and logic flaws, bridging the gap between traditional linters and static application security testing (SAST) tools.

Key Features:

Language-agnostic scanning
Customizable rules to target your code’s unique patterns
Real-time feedback in CI/CD pipelines

13. ShellCheck (Shell Scripts Quality)

Why It’s Important:

ShellCheck prevents common mistakes in shell scripts by flagging syntax errors, quoting issues, and logic pitfalls. It’s crucial for ensuring stable, maintainable build and deployment scripts.

Key Features:

Detects subtle shell issues that are easy to miss manually
Provides specific, actionable feedback
Supports multiple shell dialects

14. SwiftLint (Swift Code Quality)

Why It’s Important:

For iOS and macOS developers, SwiftLint enforces Swift style and coding conventions. It catches bad patterns early and ensures consistency across your codebase.

Key Features:

Integrates neatly with Xcode and Swift Package Manager
Offers numerous rules aligned with official Swift style guides
Supports custom rule sets

15. YamlLint (YAML Quality)

Why It’s Important:

YAML is prevalent in configuration files for CI, CD, and infrastructure. YamlLint ensures your YAML files are properly structured and free of formatting errors, preventing configuration headaches down the line.

Key Features:

Detects syntax errors, indentation issues, and trailing spaces
Easily customizable checks
Works with any YAML-based configuration

Wrapping Up

From security checks (Gitleaks, Semgrep) to language-specific quality tools (Ruff for Python, Rubocop for Ruby, SwiftLint for Swift), and configuration validations (YamlLint, Hadolint), these 15 tools represent a cross-section of developer productivity tools.

By integrating them into your workflow, you’ll enhance code consistency, reduce defects, improve maintainability, and safeguard against security vulnerabilities—all while simplifying your review process.

Adopting these tools will not only streamline your code review process but also help your team maintain high standards as your project evolves.

Integrating AI Code Review into your DevOps pipeline

Aravind Putrevu — Fri, 30 Aug 2024 14:06:36 +0000

GitHub is the central hub for countless open-source projects. However, for repository maintainers, a steady flow of pull requests (PRs) can quickly turn into an overwhelming workload. Those managing popular OSS repositories with numerous stars are well aware of the challenges of keeping up with code reviews, maintaining quality, and keeping the project on track.

A quick search for “has this been abandoned” on GitHub reveals a lot about what’s happening across various successful repositories.

A healthy maintainer-contributor relationship has many pieces, but pull requests play a substantial role: after finding an issue to work on, they are a contributor's biggest point of contact with the repo maintainers.

We at CodeRabbit have lived through this problem, and with the advent of Generative AI and code generation models, we realized we could help improve this. CodeRabbit is an AI code reviewer designed to ease the challenges of code review, supporting repository maintainers and teams. It not only reviews your PRs but also provides concise summaries, identifies potential issues, and offers insights that might be missed during manual reviews.

How CodeRabbit Works

Curious about how CodeRabbit works? Here's a breakdown of the process:

CodeRabbit integrates with GitHub, automating the code review process from the moment a pull request is created. It preprocesses the PR content, builds context, leverages Large Language Models for analysis, and then post-processes the AI response before posting the review back to GitHub. This streamlined workflow ensures thorough AI-powered code reviews without manual intervention.

Integrating with GitHub

Accessing CodeRabbit

Navigate to the CodeRabbit login page. You'll be presented with several Git platform options when you log in. Choose the one you use: GitHub, GitLab, self-hosted GitHub, or self-hosted GitLab.

After selecting your Git platform, follow the specific configuration guide:

GitHub: Standard login (steps provided below)
GitLab: Follow the standard login and authorization steps below. For organization-wide use, consider creating a dedicated GitLab user with a Personal Access Token.
Self-Hosted GitHub: Setup instructions
Self-Hosted GitLab: Setup instructions

Authorization

If you chose Login with GitHub or GitLab in the first step, you'll be prompted to authorize CodeRabbit. This step grants the necessary permissions for CodeRabbit to interact with your repositories and pull requests.

Selecting Your Organization

After authorization, if you're part of multiple organizations, you can choose which one to associate with CodeRabbit. This ensures that you're setting up the tool for the correct team or project.

Exploring the CodeRabbit Dashboard

Upon successful authorization, you'll be logged into the CodeRabbit user interface. Here, you can add repositories and configure CodeRabbit settings for each repository.

Note 💡 If you opt to authorize all repositories during setup, CodeRabbit will automatically include any new repositories you create on GitHub in the future. This saves you the hassle of manual additions down the line.

CodeRabbit Configuration

With your repositories added, it's time to configure CodeRabbit to your needs. You can configure CodeRabbit through a YAML file or using the app’s UI.

You can tailor CodeRabbit's functionality using the coderabbit.yaml file, which you place directly in your GitHub repository. This file mirrors the options available in the CodeRabbit user interface, with each setting in the YAML corresponding to a specific toggle in the UI. Configure CodeRabbit either through the coderabbit.yaml file or the interface, depending on your preference.

💡 If a coderabbit.yaml file exists in your GitHub repository, it takes precedence over any UI settings. Choose either the YAML file or UI configuration; you don't need to use both.

The table below outlines key configuration options of coderabbit.yaml:

Option	Description	Possible Values
'language'	Sets the language for the review.	"en-US", "fr-FR", etc.
'early_access'	Enables early access to features.	true, false
'reviews.profile'	Selects the review style.	"chill", "assertive"
'reviews.request_changes_workflow'	Determines if a change request workflow is used.	true, false
'reviews.high_level_summary'	Generates a summary of the PR.	true, false
'reviews.poem'	Adds a creative touch with a poem review.	true, false
'reviews.review_status'	Includes review status in the output.	true, false
'reviews.collapse_walkthrough'	Collapses the review walkthrough.	true, false
'reviews.auto_review.enabled'	Enables auto review for PRs.	true, false
'reviews.auto_review.drafts'	Reviews draft PRs.	true, false
'chat.auto_reply'	Enables automatic replies in the chat.	true, false
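
The dotted names in the table (for example reviews.auto_review.enabled) stand for nested keys in the YAML file. The JavaScript sketch below expands such dotted names into that nested shape. The helper expandDottedKeys and the chosen values are assumptions for illustration only; they are not part of CodeRabbit.

```javascript
// Expand dotted option names into the nested shape a YAML file would use.
// `expandDottedKeys` is a hypothetical helper written for this illustration.
function expandDottedKeys(flat) {
  const root = {};
  for (const [path, value] of Object.entries(flat)) {
    const keys = path.split(".");
    let node = root;
    // Walk or create intermediate objects for every key except the last.
    for (const key of keys.slice(0, -1)) {
      if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
      node = node[key];
    }
    node[keys[keys.length - 1]] = value;
  }
  return root;
}

// Example values only; pick the ones that suit your repository.
const nested = expandDottedKeys({
  language: "en-US",
  "reviews.profile": "chill",
  "reviews.high_level_summary": true,
  "reviews.auto_review.enabled": true,
  "reviews.auto_review.drafts": false,
  "chat.auto_reply": true,
});

console.log(JSON.stringify(nested, null, 2));
```

The printed object has the same nesting you would write in the YAML file: a top-level language key, a reviews block containing auto_review, and a chat block.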

Once your coderabbit.yaml file is prepared according to your needs, simply place it in your GitHub repository, and you’re all set—CodeRabbit is now integrated!

When a pull request is created targeting the master branch, CodeRabbit automatically initiates its review process. It analyzes the changes and generates a summary and walkthrough of the modifications. The specific feedback and analysis provided by CodeRabbit are determined by the options you've configured in your YAML file.

Let's examine a few examples of CodeRabbit's review comments from a specific pull request in one of the projects. This particular PR changed the language model from Llama 2 to Llama 3 for testing purposes. These examples will showcase how CodeRabbit analyzed and commented on this significant model switch.

Sample PR Review Workflow using CodeRabbit

For every PR reviewed, CodeRabbit provides a summary of changes to start with, like the below image.

This image shows CodeRabbit's review status for another pull request. It highlights that 12 actionable comments were generated, and the review also includes additional comments on specific files, demonstrating CodeRabbit's comprehensive analysis of the code changes.

You can also use CodeRabbit commands to chat with the AI code reviewer.

CodeRabbit can generate a sequence diagram when you request a full review. The sequence diagram illustrates the flow of interactions between the objects in the system.

Also check out the response when I asked what improvements could be made at the code level.

In addition to providing reviews and summaries, CodeRabbit can also detect configuration issues. For example, I accidentally set up both CodeRabbit Pro (the process we've been discussing) and the open-source version (which has a different configuration process) in my repository at the same time.

Interestingly, CodeRabbit noticed this mistake on its own and alerted me. You can see below how it pointed out this issue to me.

Check out some of the stats and test plans this AI code reviewer generated for a pull request in a different project.

CodeRabbit also allows you to configure custom review instructions based on your organization's needs, in case you want it to follow specific guidelines beyond the standard review.

Whether you manage a popular repository or are working on a smaller project, whether it's hosted on GitLab, GitHub, or self-hosted GitHub or GitLab, CodeRabbit can help streamline your development process. This AI Code Review assistant is designed to save you time by automating code reviews and offering insightful feedback.

Explore! Experiment! Discover how CodeRabbit can streamline your code review process using AI!!!

Exploring the most powerful Open LLMs launched till now in June 2024

Aravind Putrevu — Fri, 28 Jun 2024 16:42:15 +0000

In recent months, there has been huge excitement and interest around Generative AI, with tons of announcements and new innovations! It has been great for the overall ecosystem, but quite difficult for an individual dev to keep up!

Okay! While diving into the field of Generative AI, one of the most commonly used terms is LLMs. What are LLMs? (if you are new!)

Large Language Models

Large Language Models (LLMs) are a type of artificial intelligence (AI) model designed to understand and generate human-like text based on vast amounts of data. Think of an LLM as a large math ball of information, compressed into one file and deployed on a GPU for inference.

Some of the most common LLMs are OpenAI's GPT-3, Anthropic's Claude, and Google's Gemini, or the dev favorite, Meta's open-source Llama.

In this blog, we will discuss some recently launched LLMs.

Now the obvious question that comes to mind is: why should we know about the latest LLM trends?

Perhaps it is too long-winded to explain here. Hemant Mohapatra, a DevTool and Enterprise SaaS VC, has perfectly summarised how the GenAI wave is playing out.

There are more and more players commoditising intelligence, not just OpenAI, Anthropic, Google. Every new day, we see a new Large Language Model.

Top LLMs released in this Month

Here is the list of 5 recently launched LLMs, along with their intro and usefulness.

Firefunction-v2

Firefunction-v2, an open-weights function calling model, was recently released and was downloaded over 140k times in a week. It is designed for real-world AI applications and balances speed, cost, and performance. It offers function calling capabilities along with general chat and instruction following.

Key Features:

Enhanced Functionality: Firefunction-v2 can handle up to 30 different functions.
Real-World Optimization: Firefunction-v2 is designed to excel in real-world applications. It can handle multi-turn conversations and follow complex instructions.
Competitive Performance: Firefunction-v2 performs better than GPT-4o in terms of function calling capabilities, scoring 0.81 on various public benchmarks compared to GPT-4o's 0.80.
Cost-Effective and Fast: Firefunction-v2 is much more affordable than GPT-4o, costing only $0.9 per 1M output tokens compared to GPT-4o's $15.

Link to access

DeepSeek-Coder-V2

DeepSeek-Coder-V2 is an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks.

Excels in coding and math, beating GPT4-Turbo, Claude3-Opus, Gemini-1.5Pro, Codestral.
Supports 338 programming languages and 128K context length.
Fully open-sourced with two sizes: 236B and 16B

Link to model

Meta Chameleon

Meta’s Fundamental AI Research team has recently published an AI model called Meta Chameleon. Chameleon is a unique family of models that can understand and generate both images and text simultaneously. This model does both text-to-image and image-to-text generation.

Key Features:

Chameleon is versatile, accepting a combination of text and images as input and generating a corresponding mix of text and images.
It can be applied for text-guided and structure-guided image generation and editing, as well as for creating captions for images based on various prompts.
Additionally, Chameleon supports object to image creation and segmentation to image creation.

Link to Model

Nvidia's Nemotron-4 340B

Nvidia has introduced Nemotron-4 340B, a family of models designed to generate synthetic data for training large language models (LLMs). This innovative approach not only broadens the variety of training materials but also tackles privacy concerns by minimizing the reliance on real-world data, which can often include sensitive information.

Nemotron-4 also promotes fairness in AI. It creates more inclusive datasets by incorporating content from underrepresented languages and dialects, ensuring a more equitable representation.

Another significant benefit of Nemotron-4 is its positive environmental impact. Generating synthetic data is more resource-efficient compared to traditional training methods.

Link to Model

Hermes Theta

Hermes-2-Theta-Llama-3-8B is a cutting-edge language model created by Nous Research. This model is a blend of the impressive Hermes 2 Pro and Meta's Llama-3 Instruct, resulting in a powerhouse that excels in general tasks, conversations, and even specialised functions like calling APIs and generating structured JSON data.

Hermes-2-Theta-Llama-3-8B excels in a wide range of tasks. It helps you with general conversations, completing specific tasks, or handling specialised functions.

Key Features:

Conversational AI Agents: Create chatbots and virtual assistants for customer service, education, or entertainment.
Creative Content Generation: Write engaging stories, scripts, or other narrative content.
Detailed Analysis: Provide in-depth financial or technical analysis using structured data inputs.
Task Automation: Automate repetitive tasks with its function calling capabilities.

Link to model

Note: This list is never ending with Qwen 2 72B etc :D

Interestingly, I've been hearing about some more new models that are coming soon.

As developers and enterprises pick up Generative AI, I expect more solution-oriented models in the ecosystem, and maybe more open-source ones too.

I'd love to see models which can do:

Smarter Conversations: LLMs getting better at understanding and responding to human language, keeping track of semantic relationships across a conversation so that conversing with them is a pleasure.
Personal Assistant: Future LLMs might be able to manage your schedule, remind you of important events, and even help you make decisions by providing useful information. We already see that trend with tool-calling models, and if you watched the recent Apple WWDC, you can imagine where the usability of LLMs is heading.
Learning and Education: LLMs will be a great addition to education by providing personalized learning experiences. Today, they are large intelligence hoarders.

To Conclude

As we have seen throughout the blog, these are really exciting times, with the launch of these five powerful language models.

Each one brings something unique, pushing the boundaries of what AI can do. Whether it's enhancing conversations, generating creative content, or providing detailed analysis, these models really make a big impact.

At Portkey, we are helping developers build on LLMs with a blazing-fast AI Gateway that provides resiliency features like load balancing, fallbacks, and semantic caching. Drop us a star if you like it, or raise an issue if you have a feature to recommend!

Portkey-AI / gateway

A Blazing Fast AI Gateway. Route to 200+ LLMs with 1 fast & friendly API.

Gateway streamlines requests to 200+ open & closed source models with a unified API. It is also production-ready with support for caching, fallbacks, retries, timeouts, loadbalancing, and can be edge-deployed for minimum latency.

✅ Blazing fast (9.9x faster) with a tiny footprint (~45kb installed)
✅ Load balance across multiple models, providers, and keys
✅ Fallbacks make sure your app stays resilient
✅ Automatic Retries with exponential backoff come by default
✅ Configurable Request Timeouts to easily handle unresponsive LLM requests
✅ Multimodal to support routing between Vision, TTS, STT, Image Gen, and more models
✅ Plug-in middleware as needed
✅ Battle tested over 300B tokens
✅ Enterprise-ready for enhanced security, scale, and custom deployments

How to Run the Gateway?

Run it Locally for complete control & customization
Hosted by Portkey for quick setup without…

How Generative AI is impacting Developer Productivity?

Aravind Putrevu — Fri, 28 Jun 2024 14:36:25 +0000

Ever since ChatGPT was introduced, the internet and the tech community have been going gaga, and nothing less!

The icing on the cake is that you can NOW even generate code!

Over the years, I've used many developer tools, developer productivity tools, and general productivity tools like Notion. Most of these tools have helped me get better at what I wanted to do and brought sanity to several of my workflows.

However, with Generative AI, it has become turnkey. Say I have to quickly generate an OpenAPI spec: today I can do it with a local LLM like Llama using Ollama.
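
As a rough sketch of that workflow, the snippet below sends a prompt to a locally running Ollama server over its HTTP API. It assumes Ollama is listening on its default port 11434 and that a model named "llama3" has already been pulled; the model name, the prompt wording, and the helper names buildRequest and generateOpenApiSpec are illustrative choices, not something the tools prescribe.

```javascript
// Sketch: ask a locally running Ollama server to draft an OpenAPI spec.
// Assumptions: Ollama listens on its default port 11434 and a model named
// "llama3" has already been pulled; change both to match your setup.
const OLLAMA_URL = "http://localhost:11434/api/generate";

// Builds the fetch options for a non-streaming generate request.
function buildRequest(prompt, model = "llama3") {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // stream: false asks for a single JSON object instead of a stream of chunks.
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

async function generateOpenApiSpec(description) {
  const prompt = `Write an OpenAPI 3.0 spec in YAML for this API: ${description}`;
  const res = await fetch(OLLAMA_URL, buildRequest(prompt)); // global fetch, Node 18+
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = await res.json();
  return data.response; // the generated text
}

// Usage (requires a running Ollama server):
// generateOpenApiSpec("a to-do list with CRUD endpoints").then(console.log);
```

The generated spec still needs a human read-through before it goes anywhere near production.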

In this blog, we'll explore how generative AI is reshaping developer productivity and redefining the entire software development lifecycle (SDLC).

Traditional Software Development Life Cycle looks like this:

This process is complex, with a chance to have issues at each stage. While perfecting a validated product can streamline future development, introducing new features always carries the risk of bugs.

Machine Learning and Developer Productivity

Even before the Generative AI era, machine learning had already made significant strides in improving developer productivity.

Observability into code with tools such as Elastic, Grafana, or Sentry, using anomaly detection.
Build-time issue resolution - risk assessment, predictive tests.
Code quality improvement through static analysis
and more...

Rise of Generative AI

GPT-2, while pretty early, showed signs of potential in code generation and developer productivity improvement. However, its knowledge base was limited (fewer parameters, training technique, etc.), and the term "Generative AI" wasn't popular at all.

GPT-3

The introduction of ChatGPT, initially built on a model from the GPT-3.5 series, marked a significant leap forward in generative AI capabilities.

This breakthrough has impacted both B2C and B2B sectors, particularly in the realm of business-to-developer interactions.

Some Highlights are:

Code Interpreter in ChatGPT
Improved Python/other Programming Language code generation
Enhanced problem-solving capabilities, potentially reducing reliance on platforms like Stack Overflow

Note: While these models are powerful, they can sometimes hallucinate or provide incorrect information, so their output needs careful verification.

Specialized Code Generation Models

Following ChatGPT's success, several code generation models have emerged:

These models show promising results in generating high-quality, domain-specific code.

AI-powered development environments, building on the success of GitHub Copilot, have also emerged:

Cursor: An AI-first code editor
Tabnine: AI-assisted code completion
Continue.dev

Note: If you are a CTO/VP of Engineering, it'd be a great help to buy Copilot subscriptions for your team. There are tons of good features that help reduce bugs and the overall fatigue of building good code.

These tools aim to streamline the coding process and boost developer productivity.

Agentic AI and Workflow Automation

As generative AI continues to evolve, we're seeing the emergence of agentic AI frameworks that can potentially automate entire development workflows:

Open-source tools like Composio further help orchestrate these AI-driven workflows across different systems and bring productivity improvements.

The Next Leap in Developer Productivity

Generative AI is poised to revolutionise developer productivity, potentially automating significant portions of the SDLC.

While human oversight and instruction will remain crucial, the ability to generate code, automate workflows, and streamline processes promises to accelerate product development and innovation.

At Middleware, we're committed to enhancing developer productivity. Our open-source DORA metrics product helps engineering teams improve efficiency by providing insights into PR reviews, identifying bottlenecks, and suggesting ways to enhance team performance across four important metrics.

As we continue to witness the rapid evolution of generative AI in software development, it's clear that we're on the cusp of a new era in developer productivity. The challenge now lies in harnessing these powerful tools effectively while maintaining code quality, security, and ethical considerations.

Let us know what you think. What did I miss here? Or do you feel like Jayant, who feels constrained to use AI?

How to create LLM fallback from Gemini Flash to GPT-4o?

Aravind Putrevu — Thu, 13 Jun 2024 14:09:03 +0000

Generative AI has been the hottest technology trend of the past year, from enterprises to startups. Almost every brand is incorporating GenAI and Large Language Models (LLMs) in its solutions.

However, an underexplored part of Generative AI is managing resiliency. It is easy to build on an API provided by an LLM vendor like OpenAI, but hard to cope when that vendor has a service disruption.

In this blog, we will take a look at how you can create a resilient generative AI application that falls back from Gemini Flash to GPT-4o using the open-source AI Gateway's fallback feature.

Before that..

What is a fallback?

In a scenario involving APIs, if the active endpoint or server goes down, as part of a fallback strategy for high availability using a load balancer, we configure both active and standby endpoints. When the active endpoint goes down, one of the configured secondary endpoints takes over and continues to serve the incoming traffic.

Why do we need fallbacks?

Basically, fallbacks ensure application resiliency in disaster scenarios and aid quick recovery.

Note: In many cases, a loss of incoming traffic (such as HTTP requests) during recovery is a common phenomenon.

Why fallbacks in LLMs?

In the context of Generative AI, having a fallback strategy is crucial to manage resiliency. The scenario is no different from traditional server resiliency: if the active LLM becomes unavailable, one of the configured secondary LLMs takes over and continues to serve incoming requests, thereby maintaining an uninterrupted experience for users.
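
The idea can be sketched in a few lines of JavaScript: try each configured provider in order and return the first successful answer. This is only a conceptual illustration with fake providers. The function withFallback and the provider objects are hypothetical and are not the gateway's implementation or API.

```javascript
// Sketch of the fallback idea: try each provider in order and return the
// first successful result. The provider objects are stand-ins, not real SDK clients.
async function withFallback(providers, prompt) {
  const errors = [];
  for (const provider of providers) {
    try {
      const output = await provider.call(prompt);
      return { provider: provider.name, output };
    } catch (err) {
      // Remember the failure and move on to the next configured provider.
      errors.push(`${provider.name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed (${errors.join("; ")})`);
}

// Fake providers for demonstration: the primary is "down", the secondary works.
const primary = {
  name: "primary-llm",
  call: async () => {
    throw new Error("service unavailable");
  },
};
const secondary = {
  name: "secondary-llm",
  call: async (prompt) => `echo: ${prompt}`,
};

withFallback([primary, secondary], "hello").then((result) => {
  console.log(result);
});
```

Here the call to the primary fails, so the result comes from the secondary. A gateway applies the same idea at the HTTP layer, so application code does not have to carry this loop itself.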

Challenges in creating fallbacks for LLMs

While fallbacks for LLMs look very similar in concept to managing server resiliency, in reality the growing ecosystem, multiple standards, and new levers that change the outputs make it harder to simply switch over and get similar output quality and experience.

Moreover, given the changing landscape of LLMs and LLM providers, the custom logic and effort needed to add this functionality will be hard to justify for someone whose core business is not managing LLMs.

Using open-source AI Gateway to implement fallbacks

To demonstrate the fallback feature, we'll build a sample Node.js application and integrate Google's Gemini. We'll use the OpenAI SDK and Portkey's open-source AI Gateway to demonstrate the fallback to GPT.

If you are new to AI Gateway, you can refer to our previous post to learn about the features of the open-source AI Gateway.

Creating Node.js Project

To start our project, we need to set up a Node.js environment. So, let's create a Node project. The command below initializes a new Node.js project.

npm init

Install Dependencies

Let's install the required dependencies of our project.

npm install express body-parser dotenv openai portkey-ai

This will install the following packages:

express: a popular web framework for Node.js
body-parser: middleware for parsing request bodies
openai: the OpenAI SDK, used here as the client that talks to the gateway
portkey-ai: a package that lets us access multiple AI models
dotenv: loads environment variables from a .env file

Setting Environment Variables

Next, we'll create a .env file to securely store our sensitive information such as API credentials.

# .env
GEMINI_API_KEY=YOUR_API_KEY
PORT=3000

Get API Key

Before using Gemini, we need to set up API credentials from the Google Developers Console. For that, we need to sign in with our Google account and create an API key.

Once signed in, go to Google AI Studio.

Click on the Create API key button. It will generate a unique API Key that we'll use to authenticate requests to the Google Generative AI API.

After getting the API key we'll update the .env file with our API key.
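
Since the app cannot work without these values, it can help to fail fast when one is missing. The sketch below shows one way to do that. requireEnv is a hypothetical helper written for this post and is not part of dotenv. In the real app you would pass process.env after calling dotenv.config().

```javascript
// Sketch: fail fast when a required environment variable is missing.
// `requireEnv` is a hypothetical helper, not part of dotenv.
function requireEnv(env, names) {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  // Return only the requested variables.
  return Object.fromEntries(names.map((name) => [name, env[name]]));
}

// A plain object stands in for process.env so the example runs on its own.
const config = requireEnv(
  { GEMINI_API_KEY: "YOUR_API_KEY", PORT: "3000" },
  ["GEMINI_API_KEY", "PORT"]
);
console.log(config.PORT);
```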

Create Express Server

Let's create an index.js file in the root directory and set up a basic Express server.

const express = require("express");
const dotenv = require("dotenv");

dotenv.config();

const app = express();
const port = process.env.PORT;

app.get("/", (req, res) => {
  res.send("Hello World");
});

app.listen(port, () => {
  console.log(`Server running on port ${port}`);
});

Here, we're using the dotenv package to read the PORT number from the .env file.

At the top of the file, we load the environment variables with dotenv.config() so they are accessible throughout the file.

Executing the project

In this step, we'll add a start script to the package.json file to easily run our project.

Add the following script to the package.json file.

"scripts": {
  "start": "node index.js"
}

Let's run the project using the following command:

npm run start

The command above starts the Express server. Now, if we open http://localhost:3000, we'll see the Hello World response.

The project setup is now done. Next, we'll add Gemini to our project.

Adding Google Gemini

Set up Route

To add Gemini to our project, we'll create a /generate route where we'll communicate with the Gemini AI.

For that, add the following code to the index.js file.

const bodyParser = require("body-parser");
const { generateResponse } = require("./controllers/index.js");

//middleware to parse the body content to JSON
app.use(bodyParser.json());

app.post("/generate", generateResponse);

Here, we're using the body-parser middleware to parse the request body as JSON.

Configure OpenAI Client with Portkey Gateway

Let's create a controllers folder and create an index.js file within it.

Here, we will create a new controller function to handle the /generate route declared in the above code.

First, we'll import the required packages and API keys that we'll be using.

Note: Portkey adheres to OpenAI API compatibility. Using Portkey AI further enables you to communicate with any LLM using our universal API feature.

// CommonJS imports, matching the require() calls used in the root index.js
const OpenAI = require("openai");
const dotenv = require("dotenv");
const { createHeaders } = require("portkey-ai");

dotenv.config();
const GEMINIKEY = process.env.GEMINI_API_KEY;

Then, we'll instantiate our OpenAI client and pass the relevant provider details.

const gateway = new OpenAI({
  apiKey: GEMINIKEY,
  baseURL: "http://localhost:8787/v1",
  defaultHeaders: createHeaders({
    provider: "google",
  })
})

Note: To integrate the Portkey gateway with OpenAI, we have:

Set the baseURL to the Portkey Gateway URL
Included Portkey-specific headers such as provider and others

Implement Controller Function

Now, we'll write a controller function, generateResponse, to handle the /generate route and generate a response to user requests.

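const generateResponse = async (req, res) => {
  try {
    // Read the prompt sent by the client
    const { prompt } = req.body;

    // Forward the prompt to Gemini through the gateway
    const completion = await gateway.chat.completions.create({
      messages: [{ role: "user", content: prompt }],
      model: 'gemini-1.5-flash-latest',
    });

    const text = completion.choices[0].message.content;

    res.send({ response: text });

  } catch (err) {
    console.error(err);
    res.status(500).json({ message: "Internal server error" });
  }
};

// Exported with CommonJS so that require("./controllers/index.js") in the root index.js can load it
module.exports = { generateResponse };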

Here we are taking the prompt from the request body and generating a response based on the prompt using the gateway.chat.completions.create method.
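The controller above passes whatever arrives in the body straight to the gateway. As a minimal sketch, the prompt could be checked first so that a malformed request gets a 400 instead of a 500. The parsePrompt helper and its error message are hypothetical and not part of Express or Portkey.

```javascript
// Hypothetical helper: returns { ok: true, prompt } or { ok: false, error }.
function parsePrompt(body) {
  const prompt = body && body.prompt;
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return { ok: false, error: "Request body must include a non-empty string 'prompt'." };
  }
  return { ok: true, prompt: prompt.trim() };
}

// Inside the controller, this could run before calling the gateway:
//   const parsed = parsePrompt(req.body);
//   if (!parsed.ok) return res.status(400).json({ message: parsed.error });

console.log(parsePrompt({ prompt: "Are you an OpenAI model?" }));
console.log(parsePrompt({}));
```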

Run Gateway Locally

To run the gateway locally, run the following command in your terminal:

npx @portkey-ai/gateway

This will spin up the gateway locally at http://localhost:8787/

Run the project

Now, we have to check if our app is working correctly or not!

Let's run our project using:

npm run start

Validating Gemini's Response

Next, we'll make a POST request using Postman to validate our controller function.

We'll send a POST request to http://localhost:3000/generate with the following JSON payload:

{
  "prompt": "Are you an OpenAI model?"
}

And we got our response:

{
    "response": "I am a large language model, trained by Google. \n"
}

Great! Our Gemini AI integration is working as expected!

Adding Fallback using AI Gateway

So far, the project is working as expected. But what if Gemini's API doesn't respond?

As discussed earlier, a resilient app yields a better customer experience.

That's where Portkey's AI Gateway shines. It has a fallback feature that seamlessly switches between LLMs based on their performance or availability.

If the primary LLM fails to respond or encounters an error, AI Gateway will automatically fall back to the next LLM in the list, ensuring our application's robustness and reliability.
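To make the idea concrete, here is a conceptual sketch of fallback in plain JavaScript. It is not Portkey's implementation: the callWithFallback function and the { name, call } target shape are assumptions made for illustration, and the two stand-in providers never touch the network.

```javascript
// Conceptual sketch only: this is NOT Portkey's implementation.
// Each target is a hypothetical { name, call } pair, where call(prompt)
// returns a Promise that resolves with text or rejects on failure.
async function callWithFallback(targets, prompt) {
  const errors = [];
  for (const target of targets) {
    try {
      const text = await target.call(prompt);
      return { provider: target.name, text };
    } catch (err) {
      // Remember the failure and move on to the next target in the list.
      errors.push(`${target.name}: ${err.message}`);
    }
  }
  throw new Error(`All targets failed (${errors.join("; ")})`);
}

// Stand-in providers: the first always fails, the second succeeds.
const demoTargets = [
  { name: "primary", call: async () => { throw new Error("unavailable"); } },
  { name: "secondary", call: async (prompt) => `echo: ${prompt}` },
];

callWithFallback(demoTargets, "hello").then((result) => {
  console.log(result); // { provider: 'secondary', text: 'echo: hello' }
});
```

The gateway does this work for us on the server side, so the application code keeps making a single request.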

Now, let's add the fallback feature to our project!

Create Portkey Configs

First, we'll create a Portkey configuration to define routing rules for all the requests coming to our gateway. For that, add the following code:

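// Assumption: OPENAI_API_KEY has been added to the .env file alongside GEMINI_API_KEY
const OpenAIKEY = process.env.OPENAI_API_KEY;

const configObj = {
  "strategy": {
    "mode": "fallback"
  },
  "targets": [
    {
      "provider": "google",
      "api_key": GEMINIKEY // Add your Gemini API Key
    },
    {
      "provider": "openai",
      "api_key": OpenAIKEY,
      "override_params": {
        "model": "gpt-4o"
      }
    }
  ]
}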

This config will fall back to OpenAI's gpt-4o if Google's gemini-1.5-flash-latest fails.

Update OpenAI Client

To add the Portkey config to our OpenAI client, we'll simply pass the config object to createHeaders in the defaultHeaders option.

const gateway = new OpenAI({
  apiKey: GEMINIKEY,
  baseURL: "http://localhost:8787/v1",
  defaultHeaders: createHeaders({
    provider: "google",
    config: configObj
  })
})

Note: If we want to attach the configuration to only a few requests instead of modifying the client, we can send it in the request headers. For example:

const reqHeaders = createHeaders({ config: configObj });
gateway.chat.completions.create({
  messages: [{ role: "user", content: "Say this is a test" }],
  model: "gpt-3.5-turbo"
}, { headers: reqHeaders });

Also, if you have a default configuration set in the client but also include a configuration in a specific request, the request-specific configuration takes precedence and replaces the default config for that particular request.

That's it! Our setup is done.

Testing the Fallback

To see if our fallback feature is working, we'll remove the Gemini API key from the .env file. Then, we'll send a POST request to http://localhost:3000/generate with the following JSON payload:

{
  "prompt": "Are you an OpenAI model?"
}

And we'll get this response:

{
    "response": "Yes, I am powered by the OpenAI text generation model known as GPT-4o."
}

Awesome! This means our fallback feature is working perfectly!

Since we deleted the Gemini API key, the first request failed. Portkey detected that and automatically fell back to the next LLM in the list, which is OpenAI's gpt-4o.

Conclusion

In this article, we explored how to integrate Gemini into our Node.js application and how to leverage AI Gateway's fallback feature when Gemini is not available.

If you want to know more about Portkey's AI Gateway, give us a star and join our LLMs in Production Discord to hear more about what other AI engineers are building.

Happy Building!

Understanding RAG: A Deeper Dive into the Fusion of Retrieval and Generation

Aravind Putrevu — Wed, 21 Feb 2024 09:22:22 +0000

Retrieval-Augmented Generation (RAG) models represent a fascinating marriage of two distinct but complementary components: retrieval systems and generative models. By seamlessly integrating the retrieval of relevant information with the generation of contextually appropriate responses, RAG models achieve a level of sophistication that sets them apart in the realm of artificial intelligence.

How does RAG work?

Imagine you're planning a trip to a foreign country and you want to learn about its culture, history, and local attractions. You start by consulting a well-informed travel agent (retrieval system) who has access to a vast library of guidebooks and travel articles. You provide the agent with your interests and preferences (query), and they sift through their resources to find the most relevant information.

Once they've gathered all the necessary details, they pass them on to a talented tour guide (generative model) who crafts a personalized itinerary tailored to your tastes.

This itinerary seamlessly blends the information provided by the travel agent with the guide's own expertise, resulting in a comprehensive and engaging travel plan.

Architecture of RAG Models

Let's break down the architecture of RAG models into its constituent parts:

Query Processing: This is where the journey begins. When a query is submitted to a RAG model, it undergoes a process of analysis and interpretation to discern the context and intent behind it.

This is like a human travel agent who collects more information from a traveller about weather, location, budget, and food choices.

Document Retrieval: Once the query has been processed, the retrieval system springs into action. Drawing upon its extensive database of documents, the retrieval system embarks on a quest to find the most pertinent information related to the query. It sifts through mountains of data, searching for nuggets of knowledge that will help illuminate the user's inquiry.

Just as a travel agent understands the traveller's needs and preferences to recommend the best itinerary, RAG carefully analyses the user's query to identify their intent and picks the most appropriate attractions and cities within the given budget.

Response Generation: With the relevant information in hand, it's time for the generative model to work its magic. Like a skilled storyteller weaving together threads of narrative, the generative model synthesizes the retrieved information with its own internal knowledge to craft a response that is both coherent and contextually appropriate. Drawing upon its vast reservoir of linguistic patterns and semantic understanding, the model generates text that is not only factually accurate but also engaging and insightful.

Finally, a good travel agent is like a personal concierge, working tirelessly to gather all the necessary details and craft the perfect itinerary for their clients. Just as a generative model creates a cohesive story, a travel agent artfully combines flights, hotels, and activities to create a seamless and enjoyable travel experience.
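The three stages above can be sketched in a few lines of JavaScript. This is a toy illustration under loud assumptions: the corpus is made up, retrieval uses simple keyword overlap where real systems typically use vector embeddings, and the generation step only builds the prompt that a generative model would receive.

```javascript
// Toy corpus standing in for a document store (contents are made up).
const documents = [
  { id: "kyoto", text: "Kyoto is known for temples, gardens and tea houses." },
  { id: "lisbon", text: "Lisbon offers trams, seafood and ocean views." },
  { id: "cusco", text: "Cusco is the gateway to Machu Picchu and Inca history." },
];

// 1. Query processing: lowercase the text and split it into terms.
function processQuery(query) {
  return query.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// 2. Document retrieval: rank documents by how many query terms they contain.
function retrieve(terms, docs, topK = 1) {
  return docs
    .map((doc) => {
      const words = new Set(processQuery(doc.text));
      const score = terms.filter((term) => words.has(term)).length;
      return { doc, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.doc);
}

// 3. Response generation: build the prompt a generative model would receive.
function buildPrompt(query, retrieved) {
  const context = retrieved.map((doc) => `- ${doc.text}`).join("\n");
  return `Answer using only this context:\n${context}\n\nQuestion: ${query}`;
}

const query = "Where can I see temples and gardens?";
const hits = retrieve(processQuery(query), documents);
console.log(buildPrompt(query, hits));
```

Here the Kyoto entry ranks first because it shares the most terms with the question, so it becomes the context handed to the model.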

Bringing it All Together

What makes RAG models truly remarkable is the synergy between their retrieval and generation components. Like two dancers moving in perfect harmony, these components work together to create responses that are greater than the sum of their parts. By leveraging the precision and depth of retrieval systems alongside the creativity and fluency of generative models, RAG models are able to tackle a wide range of tasks with unparalleled sophistication and accuracy.

Platforms for Building RAGs: Exploring the Options

When it comes to building Retrieval-Augmented Generation (RAG) models, developers have access to a variety of platforms and tools that streamline the development process and offer integrated environments for experimentation and deployment. These fall into three groups: LLM platforms, chunkers and retrievers, and complete frameworks.

Let's take a closer look at some of the prominent platforms available:

OpenAI: OpenAI offers an API that provides access to powerful generative models, including GPT-3, which can be seamlessly integrated with retrieval systems to build RAG models. The API provides developers with a simple and intuitive interface for interacting with state-of-the-art language models, making it an ideal choice for building RAG applications.

Hugging Face's Transformers Library: Hugging Face's Transformers library is a comprehensive toolkit for natural language processing tasks, including support for RAG models. The library offers pre-trained models, fine-tuning capabilities, and a wide range of utilities for working with transformer-based architectures. With its extensive documentation and active community support, Hugging Face's Transformers library is a popular choice among developers for building RAG models.

LangChain and LlamaIndex: These open-source libraries were founded in late 2022 and have gained significant adoption in the RAG community. LangChain and LlamaIndex provide developers with tools and frameworks for building RAG pipelines, including support for retrieval systems, generative models, and prompt engineering. With their modular design and emphasis on flexibility, LangChain and LlamaIndex offer developers the freedom to customize and experiment with different components of RAG models.

OpenLLaMA and Falcon: These are open-source language models rather than frameworks. In a RAG pipeline, they can serve as the generative component, paired with a vector search engine and external data sources for retrieval. With their active development communities and growing ecosystems, OpenLLaMA and Falcon offer promising opportunities for building and deploying RAG applications.

Conclusion

In conclusion, Retrieval-Augmented Generation (RAG) models represent a groundbreaking approach to natural language processing that combines the power of retrieval systems and generative models to produce highly sophisticated and contextually relevant responses.

That being said, deploying LLMs and building RAGs in production is not easy. We at Portkey are building a community for Generative AI builders. Come join us and tell us about your problems. You can also showcase your GenAI product or solution to our user community.

Three Interesting Apache Opensource Projects that you should take a look at!

Aravind Putrevu — Wed, 02 Dec 2020 17:52:37 +0000

Nowadays, I see many devs write about "Cloud-native" software projects like Kubernetes, Prometheus, Containerd, Envoy. But there are a few other similar OSS foundation backed projects which are excellent too.

This post is about three such projects from "Apache Software Foundation (ASF)" that one should take a look at.

ASF is a long-established open-source organization that has been home to projects like Hadoop, ZooKeeper, Kafka, and Lucene. If you are interested in contributing to OSS, this might be the best place to start too.

1. Apache Airflow

Apache Airflow is an open-source workflow management platform started at Airbnb. It helps with many operational tasks in a software project, such as scheduling and monitoring cron-style jobs and managing data pipelines. Workflows are defined as Directed Acyclic Graphs (DAGs) of tasks, which Airflow can also visualize. However, it is not a streaming or ETL solution.

Cloud platforms offer managed versions of Airflow: GCP has Cloud Composer, and AWS has Managed Workflows.
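To show why the DAG matters, here is a small sketch that derives a valid run order from task dependencies. It is written in JavaScript for illustration only and is not Airflow code (Airflow pipelines are defined in Python); the pipeline object and the executionOrder function are hypothetical.

```javascript
// Hypothetical pipeline: each key is a task, each value lists the tasks it depends on.
const pipeline = {
  extract: [],
  clean: ["extract"],
  aggregate: ["clean"],
  report: ["aggregate", "clean"],
};

// Kahn's algorithm: repeatedly run the tasks whose dependencies are all done.
function executionOrder(dag) {
  const remaining = new Map(
    Object.entries(dag).map(([task, deps]) => [task, new Set(deps)])
  );
  const order = [];
  while (remaining.size > 0) {
    const ready = [...remaining.keys()].filter(
      (task) => remaining.get(task).size === 0
    );
    if (ready.length === 0) {
      // Nothing can run: there is a cycle or a dependency on an unknown task.
      throw new Error("No runnable task: cycle or unknown dependency");
    }
    for (const task of ready) {
      order.push(task);
      remaining.delete(task);
    }
    // Mark the finished tasks as satisfied for everything still waiting.
    for (const deps of remaining.values()) {
      for (const task of ready) deps.delete(task);
    }
  }
  return order;
}

console.log(executionOrder(pipeline)); // [ 'extract', 'clean', 'aggregate', 'report' ]
```

Because the graph has no cycles, such an order always exists, which is what lets a scheduler decide what to run next.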

Github Repository

2. Apache Beam

Apache Beam provides a unified model for designing and developing both batch and streaming data processing pipelines. Java, Python, and Go SDKs are available for Apache Beam. It also supports backends like Apache Flink, Apache Spark, and GCP Dataflow.

Github Repository

3. Apache Pulsar

Apache Pulsar is a popular cloud-native, distributed messaging and streaming platform created at Yahoo. You might have heard of it while researching the famous OSS message queue Kafka. Born in the cloud-native world, Pulsar can be run on Docker or Kubernetes. Pulsar has built-in connectors for MongoDB, Elasticsearch, PostgreSQL, and Redis.

Github Repository

I believe these projects are interesting to learn and try, even on a side project. I might have missed several other interesting projects or incubating ones. If you are using or have used any, mention them in the comments.

--
Stay Safe
Aravind Putrevu

Secure your Elasticsearch Cluster

Aravind Putrevu — Wed, 02 Dec 2020 07:13:55 +0000

TL;DR - Basic authentication and RBAC in Elasticsearch are free. You should enable them and protect your cluster from attacks and data breaches.

The cost of a data breach is huge to any organization, be it a bank, an eCommerce company, or an early-stage startup.

It is a loss to your customers. There is an intangible loss of reputation. And then, the regulatory problems from the authorities.

Imagine you open your email one morning and learn that the data from your Elasticsearch cluster has been deleted and that you have to pay the attacker to get it back. Sounds troublesome, doesn't it?

What security features are available for free?

More than a year ago, Elastic made the core security features of the Elastic Stack free. It means you can enable and leverage TLS encryption between nodes, role-based access control, and basic authentication.

You can leverage these features to protect the data sent via Logstash and data shippers like Filebeat.

Sidenote: You can use the secure keystores in Filebeat and Logstash to store sensitive settings.

How to check if my ES cluster is unprotected?

Option 1:

Usually, the ES cluster is bootstrapped to listen on port 9200. You can check whether you can access the cluster endpoint remotely using an API client like Postman or Hoppscotch, or something as simple as the Chrome browser in incognito mode.

<endpoint_url>:9200

If security is enabled, you should get the message below.

{
    "error": {
        "root_cause": [
            {
                "type": "security_exception",
                "reason": "action [cluster:monitor/main] requires authentication",
                "header": {
                    "WWW-Authenticate": [
                        "Basic realm=\"security\" charset=\"UTF-8\"",
                        "Bearer realm=\"security\"",
                        "ApiKey"
                    ]
                }
            }
        ],
        "type": "security_exception",
        "reason": "action [cluster:monitor/main] requires authentication",
        "header": {
            "WWW-Authenticate": [
                "Basic realm=\"security\" charset=\"UTF-8\"",
                "Bearer realm=\"security\"",
                "ApiKey"
            ]
        }
    },
    "status": 401
}
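The same check can be scripted. The sketch below is a minimal example, assuming Node.js 18+ for the built-in fetch; the classifyCluster and checkCluster names are hypothetical, and the rule it applies (a 401 with a security_exception means secured, a 200 without credentials means unprotected) is read off the response shown above.

```javascript
// Hypothetical helper: interpret the status and JSON body returned by <endpoint_url>:9200.
function classifyCluster(status, body) {
  const errorType = body && body.error && body.error.type;
  if (status === 401 && errorType === "security_exception") {
    return "secured"; // authentication is required, as in the response above
  }
  if (status === 200) {
    return "unprotected"; // the cluster answered without credentials
  }
  return "unknown";
}

// Requires Node.js 18+ for the built-in fetch; pass the endpoint you want to test.
async function checkCluster(endpointUrl) {
  const response = await fetch(endpointUrl);
  const body = await response.json().catch(() => null);
  return classifyCluster(response.status, body);
}

console.log(classifyCluster(401, { error: { type: "security_exception" } })); // secured
console.log(classifyCluster(200, { tagline: "You Know, for Search" })); // unprotected
```

Calling checkCluster with your own endpoint URL performs the real request; the two console.log lines only exercise the classification logic.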

Option 2:

Starting with Kibana 7.10, you will also see a popup in Kibana if your cluster isn't secure.

What more?

With Users and Roles, you can create multiple users and assign them specific roles to limit their access to the cluster data.

You can create a Kibana space to limit access to users for a specific dashboard.
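As a rough sketch of what creating a role and a user involves, the helpers below build the requests for Elasticsearch's security APIs (PUT /_security/role/<name> and PUT /_security/user/<username>). The helper names, the logs_reader role, the logs-* index pattern, and the jane user are all hypothetical; nothing is sent over the network here.

```javascript
// Builds the request for Elasticsearch's create-role API (PUT /_security/role/<name>).
function readOnlyRoleRequest(roleName, indexPatterns) {
  return {
    method: "PUT",
    path: `/_security/role/${encodeURIComponent(roleName)}`,
    body: { indices: [{ names: indexPatterns, privileges: ["read"] }] },
  };
}

// Builds the request for the create-user API (PUT /_security/user/<username>).
function userRequest(username, password, roles) {
  return {
    method: "PUT",
    path: `/_security/user/${encodeURIComponent(username)}`,
    body: { password, roles },
  };
}

// Hypothetical names: a "logs_reader" role limited to indices matching logs-*.
console.log(JSON.stringify(readOnlyRoleRequest("logs_reader", ["logs-*"]), null, 2));
console.log(JSON.stringify(userRequest("jane", "<a-strong-password>", ["logs_reader"]), null, 2));
```

The same roles and users can also be managed from the Kibana UI, which is usually the easier route for a handful of accounts.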

Thanks for reading this far. If you have more questions on Elasticsearch security, please join the community Slack group or ask a question on our technical forum. You could also try spinning up a cluster using this link.

Happy to help!

--
Stay Safe
Aravind Putrevu

Running Elasticsearch on Kubernetes from Azure Kubernetes Service

Aravind Putrevu — Thu, 22 Oct 2020 11:39:57 +0000

It's safe to say that Kubernetes is the de facto standard for orchestrating containers and the applications running in them. As the standard, a variety of managed services and orchestration options are available to choose from. In this blog post, we're going to take a look at running the Elastic Stack on Azure Kubernetes Service (AKS) using Elastic Cloud on Kubernetes (ECK) as the operator.

Elastic Cloud on Kubernetes is the official operator for running the Elastic Stack on Kubernetes. ECK helps to manage, scale, upgrade, and deploy the Elastic Stack securely. In the steps below, we will deploy ECK on AKS and then use that deployment to collect logs, metrics, and security events from a virtual machine on Azure.

Here's what we'll do:

Create an Azure Kubernetes Service cluster
Install Elastic Cloud on Kubernetes
Create an Elasticsearch cluster
Deploy Kibana
Create an Azure VM for us to monitor
Deploy Metricbeat to collect VM metrics and events

Deploying AKS, ECK, Elasticsearch, and Kibana

Note: You need an Azure account and the Azure CLI installed to run some platform-specific commands. We'll use the Azure CLI to create the cluster.

Step 1: Create an AKS cluster

az aks create --resource-group resourceGroupName --name clusterName --node-count 3 --generate-ssh-keys

Step 2: Connect to the AKS cluster

az aks get-credentials --resource-group resourceGroupName --name clusterName

Step 3: Install the ECK operator

kubectl apply -f https://download.elastic.co/downloads/eck/1.1.2/all-in-one.yaml

kubectl -n elastic-system logs -f statefulset.apps/elastic-operator

Step 4: Create an Elasticsearch cluster with an external IP

We're using the default load balancer that is available with Azure Kubernetes Service.

cat <<EOF | kubectl apply -f -
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
spec:
  version: 7.9.2 #Make sure you use the version of your choice
  http:
    service:
      spec:
        type: LoadBalancer #Adds an external IP
  nodeSets:
  - name: default
    count: 1
    config:
      node.master: true
      node.data: true
      node.ingest: true
      node.store.allow_mmap: false
EOF

Step 5: Monitor the cluster creation

kubectl get elasticsearch

kubectl get pods -w

Step 6: Check the logs of the pod created

kubectl logs -f quickstart-es-default-0

kubectl get service quickstart-es-http

Step 7: Retrieve the password of Elasticsearch cluster

PASSWORD=$(kubectl get secret quickstart-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode)

curl -u "elastic:$PASSWORD" -k "https://<IP_ADDRESS>:9200"

Note: The public IP address of Elasticsearch can be picked by running:
kubectl get svc quickstart-es-http
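If you call the cluster from a Node.js script instead of curl, the same two steps apply: decode the base64 secret value and send it as HTTP Basic authentication. This is a minimal sketch; the decodeSecret and basicAuthHeader helpers are hypothetical, and the encoded string is a made-up example, not a real secret.

```javascript
// Kubernetes stores secret values base64-encoded; this mirrors `base64 --decode`.
function decodeSecret(encoded) {
  return Buffer.from(encoded, "base64").toString("utf8");
}

// Builds the Authorization header that `curl -u "elastic:$PASSWORD"` sends.
function basicAuthHeader(username, password) {
  const token = Buffer.from(`${username}:${password}`, "utf8").toString("base64");
  return `Basic ${token}`;
}

// "czNjcjN0" is a made-up example value, not a real secret.
const password = decodeSecret("czNjcjN0");
console.log(password); // s3cr3t
console.log(basicAuthHeader("elastic", password));
```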

Step 8: Deploy Kibana

cat <<EOF | kubectl apply -f - 
apiVersion: kibana.k8s.elastic.co/v1 
kind: Kibana 
metadata: 
  name: quickstart 
spec: 
  version: 7.9.2 #Make sure Kibana and Elasticsearch are on the same version. 
  http: 
    service: 
      spec: 
        type: LoadBalancer #Adds an external IP
  count: 1 
  elasticsearchRef: 
    name: quickstart 
EOF

Step 9: Monitor the Kibana deployment

kubectl get kibana

Ingesting and analyzing Azure metrics

Now that we have created an Elasticsearch cluster with Kibana in AKS, let's go ahead and ingest some observability data from Azure itself. Filebeat and Metricbeat make this easy with an out-of-the-box Azure module that gathers logs (activity, sign-in, audit) and metrics (VM, container registry, billing) from the Azure cloud platform.

In this tutorial, we will install Metricbeat on an Azure VM and enable the Azure module. Before that, we also need credentials to authenticate with the Azure Monitor REST API, which uses the Azure Resource Manager authentication model.

We need a client_id, client_secret, subscription_id, and tenant_id, which can be obtained by creating an Azure Active Directory app. You can use this guide to create an Azure AD application and service principal that can access resources.

Step 1: Create an Azure VM and SSH into the VM

az vm create \
  --resource-group myResourceGroup \
  --name myVM \
  --image UbuntuLTS \
  --admin-username azureuser \
  --generate-ssh-keys

ssh azureuser@<IP_ADDRESS>

Step 2: Install Metricbeat

wget https://artifacts.elastic.co/downloads/beats/metricbeat/metricbeat-7.9.2-amd64.deb

sudo dpkg -i metricbeat-7.9.2-amd64.deb

Step 3: Configure Elasticsearch and Kibana credentials in Metricbeat

This helps us to ship the data to Elasticsearch Cluster created on AKS as well as load dashboards in Kibana.

vim /etc/metricbeat/metricbeat.yml

 setup.kibana: 
   host: "https://<public_ip_addr>:5601"

Note: The public IP address of Kibana can be picked by running:
kubectl get svc quickstart-kb-http

vim /etc/metricbeat/modules.d/azure.yml.disabled

Replace the client_id, client_secret, subscription_id, tenant_id for all the metricsets listed in the yml file. A sample might look like this.

- module: azure 
  metricsets: 
  - monitor 
  enabled: true 
  period: 300s 
  client_id: '8dec1ab1-1691-48a6-af43-f87de68e971b' 
  client_secret: '~fwL-MhOcguaD2yK1e_.OWHhhqwdp-p974' 
  tenant_id: 'aa40685b-417d-4664-b4ec-8f7640719adb' 
  subscription_id: '70bd6e77-4b1e-4835-8896-db77b8eef364' 
  refresh_list_interval: 600s 
  resources: 
  - resource_query: "resourceType eq 'Microsoft.DocumentDb/databaseAccounts'" 
    metrics: 
    - name: ["DataUsage", "DocumentCount", "DocumentQuota"] 
      namespace: "Microsoft.DocumentDb/databaseAccounts"

Step 4: Enable Azure Module and start Metricbeat

cd /usr/bin/

./metricbeat modules enable azure
./metricbeat setup --dashboards
./metricbeat -e

Step 5: Monitor Metrics of Azure in Kibana

Log into Kibana and head over to Dashboards. Search for "Azure" to look at several preconfigured dashboards regarding storage, database, billing. Here is what a sample monitoring dashboard would look like:

Wrapping up

And that's that! You have successfully built a secure Elastic Stack deployment on a managed Kubernetes service. You can also deploy other applications like Elastic APM or Elastic Workplace Search. In addition to that, you can also enable cross-cluster search and replication, which enables you to deploy an Elastic Stack on Kubernetes cluster across regions to serve users.

We encourage you to try ECK for yourself (on any Kubernetes service), and if you have further questions related to this blog post, you can always reach out to us on our community discussion forums or Slack workspace.

If you'd like to learn more about how Elastic can help with Kubernetes observability, check out our Best of Elastic Observability webinar series.

You can follow me at Aravind Putrevu

The most promising breakthroughs from Google I/O 2017

Aravind Putrevu — Fri, 02 Jun 2017 12:11:42 +0000

Google I/O is one of the biggest developer conferences. This year was particularly exciting. There were two keynotes: one from Google CEO Sundar Pichai, and another from Jason Titus, Google’s vice president of developer products.

In this article I’ll summarize the major announcements from Google I/O 2017 and also share my own perspectives on them as a developer.

Let’s get started!

I liked the way they kicked off their event this year by showing their “The Story of an Idea” video clip. This symbolizes how, when one looks at taking an idea forward, they may face many obstacles. But finally when the world starts looking at the effort and hard work, that’s when those efforts really start to shine.

The keynote starts with Sundar describing Google’s prime products and the scale at which those services are operated.

In the past few years, many of us haven’t gone a single day without using Google Search, Gmail, YouTube, Android, or some other Google product. That shows the kind of engineering excellence Google has in bringing people back again and again to the same platform.

From Mobile First to AI First

The main competitors of Google might be leading in certain sectors such as Cloud and Social Media. But Google has an edge over them as it has much more of the world’s data. All this data makes it much easier for them to integrate AI into their products.

As a result, today we see a slew of new features added to each Google product without compromising on user experience.

“The more we can democratize access to AI technology, the sooner everyone will benefit.” – Sundar Pichai

Google Lens

Google is not only successful in bringing down the error rates for its speech recognition, but it’s actually nearing a typical human’s ability to understand speech. Today Google Home understands different voices in a home, and gives results accordingly.

Along similar lines, Google Lens will enable machines to look out and describe what they see.

Cloud Tensor Processing Unit (TPU)

As a machine learning enthusiast, I know how difficult it is today to train a machine learning model. So imagine a situation where there’s a huge data set and a machine learning model written. It’s not easy to train the model on a typical computer (like your laptop) with limited computing power. For this, you need machines with more processing power so you can perform these tasks in a cost-effective way.

Since a majority of cloud vendors charge based on time, the cost of the task increases as time increases. With a better underlying infrastructure, one can train the model with more data points in the given time.

And the more a model gets trained on a wide variety of data, the more accurate its inferences become.

In this context Cloud TPU (2nd Gen TPU) is going to be a major innovation. With Cloud TPU integrated into Google Cloud Platform (GCP), we can expect that GCP can become a go-to place for all machine learning use-cases.

Google.ai

Google always talks about open-source and the democratization of its technology. It also does that for the ecosystem to thrive. TensorFlow is a good example of this.

Similarly, I believe Google.ai will serve as the one place to find all the AI work done by Google.

AutoML is crazy and shows how far Google can take AI. An example of this is a neural net building another neural net. Remember, we are still in 2017.

Applying machine learning advancements to the healthcare domain (Diabetic Retinopathy, Breast Cancer Diagnosis) reminds us that technology can change lives and help us live longer.

On that note, check out this video of a high school student’s effort to solve one of the toughest medical prediction problems.

On the fun side, have you heard of AutoDraw? Head over to it and you’ll get to see AI in action.

Google Assistant

Google Search remains Google’s main source of income.

Google Assistant is essentially Google Search, but it uses voice as its modality. By getting people to use Google Assistant, they’re getting users to search more often, and introducing more interesting features like making it more conversational, injecting it into all form factors, and finally opening it as a platform.

To learn more about developing “Actions on Google”, take a look at this wonderful API.ai demo.

Google Home

Google Assistant more or less drives Google Home. It has positioned itself as a competitor to the Amazon Alexa Platform.

Now we might assume that Google is playing catch-up to Amazon, but with the amount of data that Google has, and the daily user interaction with Google services, it may be able to out-class Alexa.

Many people will be watching to see how this battle turns out.

Excited to be a part of #io17. Check out all that’s coming to #GoogleHome: https://t.co/Q6lZJJzOsa pic.twitter.com/j6cCbvSYht
— Made by Google (@madebygoogle) May 17, 2017

Here are some of Google Home’s core features:

proactive assistance
hands-free calling
visual responses to a suitable nearby screen
integration with Google’s existing services, like Maps, Chromecast, Calendar, and YouTube

Google Photos

Right from the day it was unveiled, Google Photos (previously known as Picasa) was a tremendous hit. Again, with the ability to analyze and organize information, Google is bringing up yet another feature: Suggested Sharing. This relieves you of the time-intensive task of making photo albums and sharing them with loved ones. Now it’s just a single click.

Ditto with shared libraries and photo books. And there’s a clever integration with Google Lens. Google’s applying machine learning to everything.

YouTube

Susan Wojcicki, CEO of YouTube, brings up an interesting demonstration of the platform’s ability to solve social issues. It reminds me of the original promised power of social media.

Today, there’s no equivalent open video service to YouTube, which makes big money. Yet Google isn’t resting on its laurels. It’s introducing new features into YouTube in all possible form factors, be it your computer, smart phone, or TV.

With 360 video on the @YouTube app, you'll feel like you're the middle of the action from your couch, on the biggest screen you own. #io17 pic.twitter.com/Y3LmQaKD54
— Google (@Google) May 17, 2017

Android

Oh! How can I forget it? Most popularly used mobile operating system in the world.

Google is going full speed ahead with a new version of Android. And it’s not just patching thingsâ€Š–â€Šthese are some major new features.

As an Android developer, I’d like to congratulate and thank the entire team of Android for yet another beautiful release.

Even though Android is popular in its own sense, it still has few issues, like battery life and operating system version fragmentation. But with every new release, Google is tackling these problems by making Android more robust.

AndroidÂ O improves the user experience with features such asÂ Picture in Picture, Notification Dots, Autofill,Â _andÂ Smart Text selection_.

Of these, my favorite is Picture in Picture. Even though multi-window features were available in previous releases of Android, this time multi-window adapts better to larger form factors (including tablets).

On the machine learning side of things, I believe TensorFlow Lite and the new hardware-supported neural network API will let any phone run simple models locally for faster results. Of course, we’ll need to learn how to use it properly so we don’t hamper user experience.

Other vitals like battery life, security, startup time, and stability are important, and are the real needs of the hour.

After working at a security company, I’ve seen first-hand how many Android apps do things in a sort of gray area, against the platform’s rules.

To find such apps, Google Play Protect comes into action: it scans apps and makes sure things are in line. Google doesn’t tell you exactly what it looks for when it scans your apps, but in my view it would be a combination of many regular and behavioral security scan techniques.

They’ve also implemented limits on services running in the background, which saves battery power.

Kotlin

Kotlin support was one completely unexpected announcement from Google. Kotlin is a JVM-based open source programming language, and it’s under active development.

But I would quickly highlight some main benefits:

Android Studio support
It’s interoperable with Java
More succinct code

You can join the official Kotlin Slack group to learn more about Kotlin here.

Android Go is yet another initiative aimed at improving Android’s presence in developing countries, where connectivity and data are a problem. Here’s their YouTube Go app.

Immersive Computing (VR/AR)

Basically, computing that works more like we do. This is a niche that is just picking up, and Google does not want to be a latecomer. They introduced Daydream last year, and this time they’re already taking giant strides to solve crucial problems.

A standalone VR headset is an important achievement in the VR space because both Vive and Oculus need extra hardware to get the device working.

With augmented reality, GPS gets you to your approximate location. Then the Visual Positioning Service (VPS) can get you to the exact spot, such as a particular home within a community.

Google Expeditions can take students on immersive virtual journeys.

All Done! That’s a wrap.

Generally, we hear major product announcements at Google I/O. But this year there were no such announcements. Instead, there was a slew of updates and new features, and almost all products were enticing. Above all, it shows that Google believes AI is the way forward.

Extra Resources

About me

I’m an engineer working on a Cloud Security Product at McAfee LLC. I’m passionate about evangelizing tech, meeting developers, and helping to solve their problems.

You can learn more about me here.

Originally published on Medium. Thanks Quincy.

Getting started with Google Cloud SDK

Aravind Putrevu — Thu, 04 May 2017 17:29:06 +0000

Hello, Peeps!

Recently I spoke about Google Cloud SDK at the Google Cloud NEXT Extended event in Bangalore. I thought I should write an article about it to give you a fundamental understanding of Google Cloud SDK.

What ?

Google Cloud SDK is a set of tools for accessing Google’s public cloud platform in a secure way. It contains essential tools for maintaining, managing, and monitoring Google Cloud Platform (GCP).

It is the equivalent of the Amazon Web Services CLI.

Why ?

Many enterprises automate the monotonous, mundane tasks involved in managing their infrastructure. Tools like these help in that process.

How ?

Installation

To start working with Cloud SDK, you need to install the tools from here, depending on your OS platform. One prerequisite is Python 2.7.x. The installer is not compatible with Python 3.x and above; you can read more about this in the Google Issue Tracker. I recommend installing the Python bundle that comes with the installer.

Configuring your account

The first thing to do after a successful installation is to open your command line and type gcloud to check that Cloud SDK installed correctly.
Then run gcloud init. It opens a new browser window and asks you to log in to your Google Cloud account. Come back to the command line and select your options, such as the project and region/zone.

Tools and Commands

Cloud SDK comes with a set of command-line tools:

gcloud: the main CLI; works with Google Compute Engine (the equivalent of AWS EC2) and other GCP services
bq: works with BigQuery (comparable to Amazon Redshift)
gsutil: works with Cloud Storage (the equivalent of AWS S3)

You can list, install, and update these tools with the gcloud components command group. For example, to list them:

gcloud components list
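
The same command group can also be driven from a script. Here is a minimal Python 3 sketch, assuming gcloud is on your PATH. The helper names components_command and run are hypothetical (they are not part of Cloud SDK), and any component ID you pass in is your own choice.

```python
import shutil
import subprocess


def components_command(action, *component_ids):
    """Build a `gcloud components` command line as a list of arguments.

    Hypothetical helper: it only assembles arguments for the three
    subcommands mentioned above (list, install, update).
    """
    if action not in ("list", "install", "update"):
        raise ValueError("unsupported action: %s" % action)
    if action == "install":
        if not component_ids:
            raise ValueError("install needs at least one component ID")
        return ["gcloud", "components", "install", *component_ids]
    # list and update are used here without extra arguments
    return ["gcloud", "components", action]


def run(command):
    """Run the command, or just print it if gcloud is not installed."""
    if shutil.which(command[0]) is None:
        print("gcloud not found on PATH; would run:", " ".join(command))
        return None
    return subprocess.run(command, check=True)
```

Calling run(components_command("list")) does the same thing as typing the command above, and run(components_command("update")) updates the installed components.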

Scripting using gcloud commands

Use case: Spin up a VM in Compute Engine and run a startup script to copy an image to a Cloud Storage bucket.

Using gcloud and gsutil commands, you can write a Python script. I tried this particular use case by following a Google Cloud tutorial. I could not find that tutorial again, so I uploaded the code I used to GitHub.

list-instance.py: lists the instances
create-instance.py: creates an instance
startup-script.sh: runs after the VM boots up
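
To give a feel for the use case, here is a minimal Python 3 sketch that builds the gcloud calls for creating a VM with a startup script and for listing instances. It is not the code from the repository. The function names, the instance name, and the zone are assumptions made up for illustration, and the storage-rw scope is my guess at what a script that writes to a bucket would need.

```python
def create_instance_command(name, zone, startup_script):
    """Build the gcloud call that creates a VM with a startup script.

    All three arguments are placeholders supplied by the caller.
    """
    return [
        "gcloud", "compute", "instances", "create", name,
        "--zone", zone,
        # Compute Engine runs the file stored under the `startup-script`
        # metadata key when the VM boots.
        "--metadata-from-file", "startup-script=" + startup_script,
        # Assumption: the startup script writes to Cloud Storage,
        # so the VM gets read/write access to it.
        "--scopes", "storage-rw",
    ]


def list_instances_command():
    """Build the gcloud call that lists the VMs in the current project."""
    return ["gcloud", "compute", "instances", "list"]
```

Each returned list can be passed to subprocess.run(command, check=True) on a machine where gcloud init has already been run. The startup script itself would then use gsutil cp to copy the image into your bucket.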

Cloud API Client Libraries

If you are a programmer and prefer working in a programming language, Google Cloud offers another way to use GCP.

Google Cloud comes with 7 client API libraries. With them you could, for example, build a nice dashboard for your AWS and GCP cloud workloads.

There is exhaustive documentation, along with code samples, for the API client libraries in each language. Here is the link for Java Samples.

If none of the samples provided implements the service for a particular API, you can write your own implementation using Google APIs.

Security and Privacy Considerations

Last but not least, using Google Cloud SDK on your local machine might raise some doubts from a security perspective. gsutil ensures all data is transferred over TLS to prevent data leakage.

There are a ton of measures taken to ensure security while running these commands on your local/dev machine. In this article, GCP explains how the security risks are handled.

If you are still not satisfied, you can open Cloud Shell from the GCP Console without installing any command-line tool.

I hope this gave you a primer on the Cloud SDK. If you still need more details, head over to this link or drop a comment.

Thank you !

Hi, I'm Aravind Putrevu

Aravind Putrevu — Sun, 30 Apr 2017 10:15:06 +0000

I have been coding for 9 years.

You can find me on GitHub as aravindputrevu
and on Twitter as aravindputrevu.

I live in Bengaluru, the Silicon Valley of India.

I work for Intel.

I mostly talk to my computer in: Java, JavaScript, Python.

I am currently learning more about ethics and values in programming, or why we are coding!

Nice to meet you.