
Thursday, 10 October 2024

Building an AI-Native Security Operations Center: Revolutionizing Your Cyber Defense


In today’s fast-paced digital world, cyber threats are evolving at an unprecedented rate. For business leaders, safeguarding their organization’s digital assets isn’t just a technical challenge—it’s a strategic imperative. An AI-native Security Operations Center (SOC) represents a transformative leap in cybersecurity, providing the agility, intelligence, and resilience necessary to protect against sophisticated attacks. This blog explores the strategic advantages of an AI-native SOC and outlines a pathway for leaders to embrace this innovation.

Why an AI-Native SOC is a Strategic Game Changer

Traditional SOCs often struggle to keep pace with the volume and complexity of modern cyber threats. An AI-native SOC leverages artificial intelligence to not only detect but also predict and respond to threats in real time. This ensures that your security operations remain ahead of adversaries, providing enhanced protection and futureproofing your security defences.

By handling routine monitoring and initial threat analysis, AI optimizes your security investments, allowing human analysts to focus on more complex, value-driven tasks. This maximizes the impact of your cybersecurity talent and budget while empowering leaders to accelerate decision-making by providing actionable insights faster than traditional methods, which is crucial in mitigating the impact of security incidents.

Expanding the Vision: The Pillars of an AI-Native SOC

The foundation of an AI-native SOC rests on several key components:

1. Holistic Data Integration is not merely a technical necessity within an AI-native SOC; it is the bedrock on which effective security operations are built. The goal is to create a single source of truth that provides a comprehensive view of the organization’s security landscape. This is achieved by creating a unified data platform that aggregates and consolidates information from network traffic, endpoint logs, user activity, external threat intelligence, and more into a centralized repository. The challenges of data integration, though, are manifold and must be addressed before any meaningful progress can be made towards an AI-native SOC, because AI algorithms depend on accurate data to make reliable predictions. Data from disparate sources can be inconsistent, incomplete, or in different formats. Overcoming these challenges to ensure data quality and consistency requires robust data normalization processes and seamless whole-system integration.
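As a simple illustration, the sketch below normalizes two hypothetical log formats into one common schema; the field names and sources are assumptions for demonstration only, not any particular product’s format.

```Python
# A minimal normalization sketch, assuming hypothetical source formats and field names.
from datetime import datetime, timezone

def normalize_firewall_event(raw: dict) -> dict:
    """Map a hypothetical firewall log record onto a common event schema."""
    return {
        "timestamp": datetime.fromtimestamp(raw["epoch"], tz=timezone.utc).isoformat(),
        "source_ip": raw["src"],
        "destination_ip": raw["dst"],
        "action": raw["action"].lower(),
        "origin": "firewall",
    }

def normalize_endpoint_event(raw: dict) -> dict:
    """Map a hypothetical endpoint agent record onto the same schema."""
    return {
        "timestamp": raw["detected_at"],               # already ISO 8601 in this source
        "source_ip": raw.get("host_ip", "unknown"),
        "destination_ip": raw.get("remote_ip", "unknown"),
        "action": raw["verdict"].lower(),
        "origin": "endpoint",
    }

# Both records end up in one consistent shape for downstream AI models.
events = [
    normalize_firewall_event({"epoch": 1718000000, "src": "10.0.0.5", "dst": "203.0.113.9", "action": "DENY"}),
    normalize_endpoint_event({"detected_at": "2024-06-10T06:13:20+00:00", "host_ip": "10.0.0.7", "verdict": "Blocked"}),
]
for event in events:
    print(event)
```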

Existing security infrastructure, such as SIEMs (Security Information and Event Management), XDR (eXtended Detection and Response), SOAR (Security Orchestration, Automation, and Response), firewalls, and IDS/IPS (Intrusion Detection Systems/Intrusion Prevention Systems), as well as network infrastructure from the data centre to internal networks, routers, and switches capable of capturing NetFlow, for example, must work in harmony with the new AI tools. This can involve secure engineering (SecDevOps) efforts to develop custom connectors or to leverage middleware solutions that facilitate data exchange between systems.

2. Smart Automation and Orchestration are crucial for an AI-native SOC to operate efficiently. Automated response mechanisms can swiftly and accurately handle routine incident responses, such as isolating compromised systems or blocking malicious IP addresses, while orchestration platforms synchronize those responses across various security tools and teams, ensuring a cohesive and effective defence. To confidently reduce the workload on human analysts and minimize the potential for human error, it is critical to develop comprehensive, intelligent playbooks that define automated actions for various types of incidents.

For example, if a malware infection is reported via integrated threat intelligence feeds, the playbook might specify steps to first scan for the IoCs (indicators of compromise), isolate any affected endpoint, scan for other infections, and initiate remediation processes. These actions are executed automatically, without the need for manual intervention. And because your security and network solutions are already seamlessly integrated, when an incident is detected your orchestration platform coordinates responses across your architecture, ensuring that all relevant tools and teams are alerted and appropriate actions are taken at machine speed.
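To make the idea concrete, here is a minimal, hypothetical playbook sketch in Python; the action names and parameters are placeholders, not calls to any specific SOAR product.

```Python
# A minimal playbook sketch, assuming hypothetical helper actions for each response step.
MALWARE_PLAYBOOK = [
    ("scan_for_iocs", {"feeds": ["internal-ti", "vendor-ti"]}),
    ("isolate_endpoint", {"method": "network-quarantine"}),
    ("sweep_fleet_for_infection", {"scope": "all-endpoints"}),
    ("start_remediation", {"ticket_queue": "soc-tier2"}),
]

def run_playbook(playbook, incident_id):
    """Execute each step in order; a real platform would call SOAR/XDR APIs here."""
    for action, params in playbook:
        print(f"[{incident_id}] executing {action} with {params}")
        # placeholder for an actual orchestration call, e.g. a SOAR client invocation

run_playbook(MALWARE_PLAYBOOK, incident_id="INC-001")
```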

3. Human-AI Synergy enhances decision-making. Security analysts benefit from AI-driven insights and recommendations, which augment their ability to make strategic decisions. While AI and automation are powerful, human expertise remains indispensable in the SOC: the goal of an AI-native SOC is not to replace human analysts but to augment their capabilities. For example, when an anomaly is detected, AI can provide context by correlating it with historical data and known threat intelligence, helping analysts quickly understand the significance of the anomaly and determine the appropriate response.

Continuous learning systems are another vital component. These systems learn from analyst feedback and real-world incidents to improve their performance over time. For instance, if an analyst identifies a false positive, this information is fed back into the AI model, which adjusts its algorithms to reduce similar false positives in the future. This iterative process ensures that the AI system continually evolves and adapts to new threats.
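A minimal sketch of such a feedback loop might look like the following; the feature vectors, labels, and model choice are illustrative assumptions rather than a production design.

```Python
# Analyst-feedback loop sketch: confirmed verdicts are appended to a training set
# and the model is periodically refit. Data and model choice are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

features, labels = [], []   # grows over time with analyst-confirmed verdicts

def record_feedback(event_features, analyst_says_malicious: bool):
    features.append(event_features)
    labels.append(1 if analyst_says_malicious else 0)

def retrain():
    if len(set(labels)) < 2:
        return None   # need both classes before a model can be fit
    return LogisticRegression().fit(np.array(features), np.array(labels))

# An analyst marks one alert as a false positive and another as a true detection
record_feedback([0.2, 5, 1], analyst_says_malicious=False)
record_feedback([0.9, 60, 14], analyst_says_malicious=True)
model = retrain()
print(model.predict_proba([[0.3, 8, 2]])[0][1] if model else "not enough feedback yet")
```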

4. Advanced AI and Machine Learning Algorithms drive the AI-native SOC’s capabilities. Through proactive anomaly detection, predictive threat intelligence, and behavioral analytics, these technologies transform raw data into actionable intelligence, enabling the AI-native SOC to detect and respond to threats with unprecedented speed and accuracy. Proactive anomaly detection is one of the primary functions of AI in the SOC. Using unsupervised learning techniques, AI can analyze vast amounts of data to establish baselines of normal behavior. Any deviation from these baselines is flagged as a potential anomaly, prompting further investigation. This capability is particularly valuable for identifying zero-day attacks and advanced persistent threats (APTs), which often evade traditional detection methods.
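As a rough illustration of baseline-driven anomaly detection, the sketch below uses scikit-learn’s IsolationForest on synthetic data; the features and contamination setting are illustrative assumptions, not a production configuration.

```Python
# Unsupervised anomaly-detection sketch using scikit-learn's IsolationForest;
# the features (bytes sent, logins per hour, distinct destinations) are illustrative only.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Synthetic "normal" behavior: [bytes_sent_mb, logins_per_hour, distinct_destinations]
normal = rng.normal(loc=[50, 3, 10], scale=[10, 1, 3], size=(500, 3))
# A few deviations that should stand out against the learned baseline
suspicious = np.array([[900, 40, 200], [5, 0, 1]])

model = IsolationForest(contamination=0.01, random_state=0).fit(normal)
for row, verdict in zip(suspicious, model.predict(suspicious)):
    label = "anomaly" if verdict == -1 else "normal"
    print(row, "->", label)
```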

Predictive threat intelligence is another critical application. Supervised learning models are trained on historical data to recognize patterns associated with known threats. These models can then predict future threats based on similar patterns. For instance, if a specific sequence of events has historically led to a ransomware attack, the AI can alert security teams to take preventive measures when similar patterns are detected.
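As an illustration of the supervised approach described above, here is a minimal sketch using a scikit-learn classifier on synthetic, hypothetical features; a real model would be trained on historical incident data.

```Python
# Supervised prediction sketch: a classifier trained on hypothetical labeled event patterns.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(7)
# Features: [failed_logins, new_processes, outbound_connections]; label 1 = preceded ransomware
X = np.vstack([rng.normal([2, 5, 10], 1, size=(200, 3)),      # benign history
               rng.normal([30, 40, 80], 5, size=(200, 3))])   # pre-ransomware history
y = np.array([0] * 200 + [1] * 200)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
# Probability that the current activity pattern precedes an attack
print(clf.predict_proba([[28, 35, 75]])[0][1])
```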

Behavioral analytics add another layer of sophistication. By analyzing the behavior of users and entities within the network, AI can detect insider threats, compromised accounts, and other malicious activities that might not trigger traditional alarms. Behavioral analytics rely on both supervised and unsupervised learning techniques to identify deviations from normal behavior patterns.

5. Ongoing Monitoring and Adaptation ensure that the AI-native SOC remains effective. The dynamic nature of cyber threats necessitates continuous monitoring and adaptation. Real-time threat monitoring involves using AI to analyze data streams as they are generated, allowing the SOC to identify and respond to threats immediately and driving down vital KPIs such as mean time to acknowledge (MTTA), mean time to detect (MTTD), and mean time to repair (MTTR). Adaptive AI models play a crucial role in this process: they continuously learn from new data and incidents, adjusting their algorithms to stay ahead of emerging threats. Feedback mechanisms are essential for maintaining the effectiveness of the SOC. After each incident, a post-incident review is conducted to assess the response and identify areas for improvement. The insights gained from these reviews are used to refine AI models and response playbooks, ensuring that the SOC becomes more robust with each incident.
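As a small illustration of these KPIs, the sketch below computes MTTA and MTTR from a couple of hypothetical incident records; the timestamps and field names are illustrative, not from any particular platform.

```Python
# KPI sketch: mean time to acknowledge (MTTA) and mean time to repair (MTTR)
# computed from hypothetical incident timestamps.
from datetime import datetime

incidents = [
    {"detected": "2024-10-01T10:00", "acknowledged": "2024-10-01T10:05", "resolved": "2024-10-01T11:00"},
    {"detected": "2024-10-02T14:00", "acknowledged": "2024-10-02T14:02", "resolved": "2024-10-02T14:40"},
]

def minutes_between(start: str, end: str) -> float:
    return (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds() / 60

mtta = sum(minutes_between(i["detected"], i["acknowledged"]) for i in incidents) / len(incidents)
mttr = sum(minutes_between(i["detected"], i["resolved"]) for i in incidents) / len(incidents)
print(f"MTTA: {mtta:.1f} min, MTTR: {mttr:.1f} min")
```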

Implementing Your AI-Native SOC: A Strategic Approach

Successfully implementing an AI-native SOC requires a strategic approach that aligns with your organization’s broader business goals. The following steps outline a comprehensive roadmap for this transformation:

Evaluate Your Current Landscape

Begin by conducting a thorough assessment of your current security operations. Identify existing strengths and weaknesses, and pinpoint areas where AI can provide the most significant benefits. This assessment should consider your existing infrastructure, data sources, and the current capabilities of your security team.

Define Strategic Objectives

Clearly define the strategic objectives for your AI-native SOC initiative. These objectives should align with your organization’s broader business goals and address specific security challenges. For example, your objectives might include reducing response times, improving threat detection accuracy, or optimizing resource allocation.

Select and Integrate Advanced Technologies

Choosing the right technologies is critical for the success of your AI-native SOC. Select AI and automation solutions that complement your existing infrastructure and offer seamless integration. This might involve working with vendors to develop custom solutions or leveraging open-source tools that can be tailored to your needs.

Build a Forward-Thinking Team

Assemble a multidisciplinary team with expertise in AI, cybersecurity, and data science. This team will be responsible for developing, implementing, and managing your AI-native SOC. Invest in ongoing training to ensure that your team remains at the forefront of technological advancements.

Pilot and Scale

Start with pilot projects to test and refine your AI models in controlled environments. These pilots should focus on specific use cases that offer the greatest potential for impact. Use the insights gained from these pilots to scale your AI-native SOC across the organization, addressing any challenges that arise during the scaling process.

Monitor, Learn, and Evolve

Continuously monitor the performance of your AI-native SOC, learning from each incident to adapt and improve. Establish feedback loops that allow your AI models to learn from real-world incidents and analyst feedback. Foster a culture of continuous improvement to ensure that your SOC remains effective in the face of evolving threats.

Overcoming Challenges

Implementing an AI-native SOC is not without challenges. Data privacy and compliance must be ensured, balancing security with privacy concerns. This involves implementing robust data protection measures and ensuring that your AI systems comply with relevant regulations.

Managing false positives is another significant challenge. AI models must be continuously refined to minimize false positives, which can erode trust in the system and waste valuable resources. This requires a careful balance between sensitivity and specificity in threat detection.

The integration process can be complex, particularly when dealing with legacy systems and diverse data sources. Thoughtful planning and expert guidance can help navigate these challenges effectively. This might involve developing custom connectors, leveraging middleware solutions, or working with vendors to ensure seamless integration.

Conclusion

For business leaders, building an AI-native SOC is more than a technological upgrade; it’s a strategic investment in the future security and resilience of your organization. By embracing AI-native security operations, you can transform your approach to cyber defense, safeguarding your assets, optimizing resources, and staying ahead of emerging threats. The journey to an AI-native SOC involves challenges, but with the right strategy and commitment, the rewards are substantial and enduring.

Transform your cyber defence strategy today. The future is AI-native, and the future is now.

Saturday, 21 September 2024

Putting AI Into AIOps: A Future Beyond Dashboards


In today’s fast-paced IT environment, traditional dashboards and reactive alert systems are quickly becoming outdated. The digital landscape requires a more proactive and intelligent approach to IT operations. Enter Artificial Intelligence (AI) in IT Operations (AIOps), a transformative approach that leverages AI to turn data into actionable insights, automate responses, and enable self-healing systems. This shift isn’t just about integrating AI into existing frameworks; it has the potential to fundamentally transform IT operations.

The Evolution of IT Operations: From Reactive to Proactive


The traditional model of IT operations has long been centered around dashboards, manual interventions, and reactive processes. What once sufficed in simpler systems is now inadequate in today’s complex, interconnected environments. Today’s systems produce vast volumes of logs, metrics, events, and alerts, creating overwhelming noise that hides critical issues. It’s like searching for a whisper in a roaring crowd. The main challenge isn’t the lack of data, but the difficulty of extracting timely, actionable insights.

AIOps steps in by addressing this very challenge, offering a path to shift from reactive incident management to proactive operational intelligence. The introduction of a robust AIOps maturity model allows organizations to progress from basic automation and predictive analytics to advanced AI techniques, such as generative and multimodal AI. This evolution allows IT operations to become insight-driven, continuously improving, and ultimately self-sustaining. What if your car could not only drive itself and learn from every trip, but also only alert you when critical action was needed, cutting through the noise and allowing you to focus solely on the most important decisions?

Leveraging LLMs to Augment Operations


A key advancement in AIOps is the integration of Large Language Models (LLMs) to support IT teams. LLMs process and respond in natural language to enhance decision-making by offering troubleshooting suggestions, identifying root causes, and proposing next steps, seamlessly collaborating with the human operators.

When problems occur in IT operations, teams often lose crucial time manually sifting through logs, metrics, and alerts to diagnose the problem. It’s like searching for a needle in a haystack; we waste valuable time digging through endless data before we can even begin solving the real issue. With LLMs integrated into the AIOps platform, the system can instantly analyze large volumes of unstructured data, such as incident reports and historical logs, and suggest the most probable root causes. LLMs can quickly recommend the right service group for an issue using context and past incident data, speeding up ticket assignment and resulting in quicker user resolution.
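A minimal sketch of this pattern is shown below; `llm_complete` stands in for whatever model endpoint a given AIOps platform exposes, and the alert, logs, and past incidents are hypothetical examples.

```Python
# Root-cause suggestion sketch: incident context is folded into a prompt for an LLM.
# `llm_complete` is a hypothetical wrapper around the platform's model endpoint.
def llm_complete(prompt: str) -> str:
    # Placeholder: a real platform would call its LLM API here.
    return "Likely root cause: recent network configuration change on edge router."

def suggest_root_cause(alert: dict, recent_logs: list[str], past_incidents: list[str]) -> str:
    prompt = (
        "You are assisting an IT operations team.\n"
        f"Alert: {alert['summary']}\n"
        "Recent logs:\n" + "\n".join(recent_logs[-20:]) + "\n"
        "Similar past incidents:\n" + "\n".join(past_incidents[:5]) + "\n"
        "Suggest the most probable root cause and the service group that should own the ticket."
    )
    return llm_complete(prompt)

print(suggest_root_cause(
    {"summary": "Transaction latency above 2s on payments API"},
    ["2024-09-21T02:11 config push to edge-router-7", "2024-09-21T02:14 latency alarm raised"],
    ["INC-482: latency spike after router config change"],
))
```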

LLMs can also offer recommended next steps for remediation based on best practices and past incidents, speeding up resolution and helping less experienced team members make informed decisions, boosting overall team competence. It’s like having a seasoned mentor by your side, guiding you with expert advice for every step. Even beginners can quickly solve problems with confidence, improving the whole team’s performance.

Use Case: Revolutionizing Incident Management in Global Finance


In the global finance industry, seamless IT operations are essential for ensuring reliable and secure financial transactions. System downtimes or failures can lead to major financial losses, regulatory fines, and damaged customer trust. Traditionally, IT teams have used a mix of monitoring tools and manual analysis to address issues, but this often causes delays, missed alerts, and a backlog of unresolved incidents. It’s like managing a train network with outdated signals: everything slows down to avoid mistakes, but delays still lead to costly problems. Similarly, traditional IT incident management in finance slows responses, risking system failures and trust.

IT Operations Challenge

A major global financial institution is struggling with frequent system outages and transaction delays. Its traditional operations model relies on multiple monitoring tools and dashboards, causing slow response times, a high Mean Time to Repair (MTTR), and an overwhelming number of false alerts that burden the operations team. The institution urgently needs a solution that can detect and diagnose issues more quickly while also predicting and preventing problems before they disrupt financial transactions.

AIOps Implementation

The institution implements an AIOps platform that consolidates data from multiple sources, such as transaction logs, network metrics, events, and configuration management databases (CMDBs). Using machine learning, the platform establishes a baseline for normal system behavior and applies advanced techniques like temporal proximity filtering and collaborative filtering to detect anomalies. These anomalies, which would typically be lost in the overwhelming data noise, are then correlated through association models to accurately identify the root causes of issues, streamlining the detection and diagnosis process.
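As an illustration of temporal proximity filtering, the sketch below groups alerts that fire within a short window so they can be correlated to a common root cause; the alerts and the five-minute window are illustrative assumptions.

```Python
# Temporal proximity filtering sketch: alerts raised close together in time are grouped
# so they can be correlated to one probable root cause. Thresholds are illustrative.
from datetime import datetime, timedelta

alerts = [
    ("2024-09-21T02:14:05", "payments-api latency high"),
    ("2024-09-21T02:14:40", "edge-router-7 packet loss"),
    ("2024-09-21T02:15:10", "db-cluster replication lag"),
    ("2024-09-21T09:30:00", "nightly batch job slow"),
]

def group_by_proximity(alerts, window=timedelta(minutes=5)):
    groups, current = [], []
    for ts, msg in sorted(alerts):
        t = datetime.fromisoformat(ts)
        if current and t - current[-1][0] > window:
            groups.append(current)
            current = []
        current.append((t, msg))
    if current:
        groups.append(current)
    return groups

for i, group in enumerate(group_by_proximity(alerts), start=1):
    print(f"Correlated group {i}: {[msg for _, msg in group]}")
```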

To enhance incident management, the AIOps platform integrates a Large Language Model (LLM) to strengthen the operations team’s capabilities. When a transaction delay occurs, the LLM quickly analyzes unstructured data from historical logs and recent incident reports to identify likely causes, such as a recent network configuration change or a database performance issue. Based on patterns from similar incidents, it determines which service group should take ownership, streamlining ticket assignment and accelerating issue resolution, ultimately reducing Mean Time to Repair (MTTR).

Results

  • Reduced MTTR and MTTA: The financial institution experiences a significant reduction in Mean Time to Repair (MTTR) and Mean Time to Acknowledge (MTTA), as issues are identified and addressed much faster with AIOps. The LLM-driven insights allow the operations team to bypass initial diagnostic steps, leading directly to effective resolutions.
  • Proactive Issue Prevention: By leveraging predictive analytics, the platform can forecast potential issues, allowing the institution to take preventive measures. For example, if a trend suggests a potential future system bottleneck, the platform can automatically reroute transactions or notify the operations team to perform preemptive maintenance.
  • Enhanced Workforce Efficiency: The integration of LLMs into the AIOps platform enhances the efficiency and decision-making capabilities of the operations team. By providing dynamic suggestions and troubleshooting steps, LLMs empower even the less experienced team members to handle complex incidents with confidence, improving the user experience.
  • Reduced Alert Fatigue: LLMs help filter out false positives and irrelevant alerts, reducing the burden of noise that overwhelms the operations team. By focusing attention on critical issues, the team can work more effectively without being bogged down by unnecessary alerts.
  • Improved Decision-Making: With access to data-driven insights and recommendations, the operations team can make more informed decisions. LLMs analyze vast amounts of data, drawing on historical patterns to offer guidance that would be difficult to obtain manually.
  • Scalability: As the financial institution grows, AIOps and LLMs scale seamlessly, handling increasing data volumes and complexity without sacrificing performance. This ensures that the platform remains effective as operations expand.

Moving Past Incident Management


The use case shows how AIOps, enhanced by LLMs, can revolutionize incident management in finance, but its potential applies across industries. With a strong maturity model, organizations can achieve excellence in monitoring, security, and compliance. Supervised learning optimizes anomaly detection and reduces false positives, while generative AI and LLMs analyze unstructured data, offering deeper insights and advanced automation.

By focusing on high-impact areas such as reducing resolution times and automating tasks, businesses can rapidly gain value from AIOps. The aim is to build a fully autonomous IT environment that self-heals, evolves, and adapts to new challenges in real time, much like a car that not only drives itself but learns from each trip, optimizing performance and solving issues before they arise.

Conclusion

“Putting AI into AIOps” isn’t just a catchy phrase – it’s a call to action for the future of IT operations. In a world where the pace of change is relentless, merely keeping up or treading water isn’t enough; organizations must leap ahead to become proactive. AIOps is the key, transforming vast data into actionable insights and moving beyond traditional dashboards.

This isn’t about minor improvements; it’s a fundamental shift. Imagine a world where issues are predicted and resolved before they cause disruption, where AI helps your team make smarter, faster decisions, and operational excellence becomes standard. The global finance example shows real benefits: reduced risks, lower costs, and a seamless user experience.

Those who embrace AI-driven AIOps will lead the way, redefining success in the digital era. The era of intelligent, AI-powered operations is here. Are you ready to lead the charge?

Source: cisco.com

Wednesday, 21 August 2024

The AI Revolution: Transforming Technology and Reshaping Cybersecurity


Artificial Intelligence (AI) is revolutionizing government and technology, driving an urgent need for innovation across all operations. Although historically, local and state government systems have seen only incremental changes with limited AI adoption, today, a significant shift is occurring as AI is integrated across all government sectors.

Benefits of AI Integration


The benefits of these changes are evident. AI-powered systems analyze vast amounts of data, offering insights for better decision-making. Public services become more personalized and efficient, reducing wait times and enhancing citizen satisfaction. Security is significantly bolstered through AI-driven threat detection and response. Consequently, governments are adopting AI and advanced software applications to provide secure, reliable, and resilient services to their citizens, enhancing digital engagement and communication within their communities.

With this rapid growth, cybersecurity operations are among the areas most significantly impacted by advancements in artificial intelligence. CyberOps sits at a unique intersection: it needs to leverage advanced AI capabilities to enhance its effectiveness and resiliency, while numerous applications and connections simultaneously challenge it by using those same emerging AI capabilities to improve their own effectiveness and resilience. Despite historically being rigid and resistant to change, CyberOps must adapt to the challenges of an AI-driven digital world.

Whole-of-State / Agency Cybersecurity Approach


Governments pursuing whole-of-state cybersecurity and zero trust can be challenged to maintain digital operations while ensuring the privacy and security of sensitive information. Cisco’s technology allows agencies to meet these requirements through advanced AI-powered security solutions and privacy-preserving AI models. Thanks to techniques like federated learning and differential privacy, sensitive information can be processed and analyzed without compromising individual privacy.


Adopting AI-Driven Services


Adopting AI-driven, easily consumable, on-demand services provides a secure, sustainable, and reliable foundation to build on. Investing in an infrastructure that is secure and flexible allows governments to quickly pivot to the emerging opportunities that the AI revolution brings. No one person could have predicted or prepared for such a transformative shift. Still, the ability to rapidly adapt to the challenges it brought and continue to serve the community and citizens in the ways they deserve is key.

Challenges and Adaptation


Don’t be mistaken: change is often hard. Humans are creatures of habit and comfort and rarely like to be pushed outside their comfort zones. Unfortunately, the AI revolution is doing just that. It is forcing us to adapt and discover new ways to operate and provide what are now seen as even the most basic digital services. The drive and demand for AI-powered services in the government sector are rapidly expanding. We are experiencing one of the most significant catalysts for technological adoption in the state and local government space since the internet became mainstream.

This revolution is driving the necessity for a whole-of-state cybersecurity and zero trust approach. The goal is no longer maintaining the status quo but rather achieving a level of service that provides the foundation for how things can be in an AI-enabled world. Providing enhanced, secure services and support to the community has become the resounding focus of state and local governments.

Cisco’s Role in Supporting Governments


As we navigate this AI revolution, Cisco stands ready to support governments in their journey towards whole-of-state cybersecurity and zero trust adoption. Our comprehensive suite of AI-powered solutions provides the building blocks for a secure and efficient AI-enabled government infrastructure. The shift to a more inclusive, AI-driven government began with specific applications but is rapidly expanding to all sectors and offerings in the state and local government spaces.

Source: cisco.com

Saturday, 10 August 2024

Optimizing AI Workloads with NVIDIA GPUs, Time Slicing, and Karpenter

Maximizing GPU efficiency in your Kubernetes environment


In this article, we will explore how to deploy GPU-based workloads in an EKS cluster using the NVIDIA Device Plugin and how to ensure efficient GPU utilization through features like Time Slicing. We will also discuss setting up node-level autoscaling to optimize GPU resources with solutions like Karpenter. By implementing these strategies, you can maximize GPU efficiency and scalability in your Kubernetes environment.

Additionally, we will delve into practical configurations for integrating Karpenter with an EKS cluster, and discuss best practices for balancing GPU workloads. This approach will help in dynamically adjusting resources based on demand, leading to cost-effective and high-performance GPU management. The diagram below illustrates an EKS cluster with CPU and GPU-based node groups, along with the implementation of Time Slicing and Karpenter functionalities. Let’s discuss each item in detail.

[Diagram: EKS cluster with CPU- and GPU-based node groups, showing Time Slicing and Karpenter]

Basics of GPU and LLM


A Graphics Processing Unit (GPU) was originally designed to accelerate image processing tasks. However, due to its parallel processing capabilities, it can handle numerous tasks concurrently. This versatility has expanded its use beyond graphics, making it highly effective for applications in Machine Learning and Artificial Intelligence.


When a process is launched on a GPU-based instance, these are the steps involved at the OS and hardware level (a simplified code sketch follows the list):

  • The shell interprets the command and creates a new process using the fork (create a new process) and exec (replace the process’s memory space with a new program) system calls.
  • Memory is allocated for the input data and the results using cudaMalloc (memory is allocated in the GPU’s VRAM).
  • The process interacts with the GPU driver to initialize the GPU context; the driver manages resources including memory, compute units, and scheduling.
  • Data is transferred from CPU memory to GPU memory.
  • The process then instructs the GPU to start computations using CUDA kernels, and the GPU scheduler manages the execution of the tasks.
  • The CPU waits for the GPU to finish its task, and the results are transferred back to the CPU for further processing or output.
  • GPU memory is freed, the GPU context is destroyed, and all resources are released. The process exits, and the OS reclaims its resources.
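As referenced above, the following sketch mirrors that flow at a high level using PyTorch (an assumption for illustration; the steps above describe raw CUDA calls such as cudaMalloc). It allocates data on the CPU, moves it to the GPU, runs a computation, and copies the result back.

```Python
# High-level illustration of the CPU-to-GPU flow using PyTorch; falls back to CPU if no GPU exists.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Allocate input data in host (CPU) memory
a = torch.randn(4096, 4096)
b = torch.randn(4096, 4096)

# Transfer from CPU memory to GPU memory (analogous to cudaMalloc + copy)
a_gpu, b_gpu = a.to(device), b.to(device)

# Launch the computation; on a GPU this runs as parallel CUDA kernels
c_gpu = a_gpu @ b_gpu

# Wait for the GPU to finish, then copy the result back to CPU memory
if device == "cuda":
    torch.cuda.synchronize()
result = c_gpu.to("cpu")
print(result.shape)
```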

Compared to a CPU, which executes instructions in sequence, a GPU processes instructions simultaneously. GPUs are also more optimized for high-performance computing because they don’t have the overhead a CPU has, like handling interrupts and the virtual memory that is necessary to run an operating system. GPUs were never designed to run an OS, and thus their processing is more specialized and faster.


Large Language Models


A Large Language Model refers to:

  • “Large”: Refers to the model’s extensive number of parameters and the volume of data on which it is trained
  • “Language”: The model can understand and generate human language
  • “Model”: Refers to the underlying neural networks


Run LLM Model


Ollama is a tool for running open-source Large Language Models and can be downloaded here: https://ollama.com/download

Pull the example model llama3:8b using the ollama CLI:

ollama -h
Large language model runner
Usage:
  ollama [flags]
  ollama [command]
Available Commands:
  serve Start ollama
  create Create a model from a Modelfile
  show Show information for a model
  run Run a model
  pull Pull a model from a registry
  push Push a model to a registry
  list List models
  ps List running models
  cp Copy a model
  rm Remove a model
  help Help about any command
Flags:
  -h, --help help for ollama
  -v, --version Show version information
Use "ollama [command] --help" for more information about a command.

ollama pull llama3:8b: Pull the model


ollama pull llama3:8b
pulling manifest 
pulling 6a0746a1ec1a... 100% ▕██████████████████████████████████████████████████████████████████ 4.7 GB 
pulling 4fa551d4f938... 100% ▕██████████████████████████████████████████████████████████████████ 12 KB 
pulling 8ab4849b038c... 100% ▕██████████████████████████████████████████████████████████████████ 254 B 
pulling 577073ffcc6c... 100% ▕██████████████████████████████████████████████████████████████████ 110 B 
pulling 3f8eb4da87fa... 100% ▕██████████████████████████████████████████████████████████████████ 485 B 
verifying sha256 digest 
writing manifest 
removing any unused layers 
success

ollama show llama3:8b: Show model information


developer:src > ollama show llama3:8b
  Model 
        arch llama 
        parameters 8.0B 
        quantization Q4_0 
        context length 8192 
        embedding length 4096 

  Parameters 
        num_keep 24 
        stop "<|start_header_id|>" 
        stop "<|end_header_id|>" 
        stop "<|eot_id|>" 

  License 
        META LLAMA 3 COMMUNITY LICENSE AGREEMENT 
        Meta Llama 3 Version Release Date: April 18, 2024

ollama run llama3:8b: Run the model


developer:src > ollama run llama3:8b
>>> print all primes between 1 and n
Here is a Python solution that prints all prime numbers between 1 and `n`:
```Python
def print_primes(n):
    for possiblePrime in range(2, n + 1):
        # Assume number is prime until shown it is not. 
        isPrime = True
        for num in range(2, int(possiblePrime ** 0.5) + 1):
            if possiblePrime % num == 0:
                isPrime = False
                break
        if isPrime:
            print(possiblePrime)
n = int(input("Enter the number: "))
print_primes(n)
```
In this code, we loop through all numbers from `2` to `n`. For each number, we assume it's prime and then check if it has any 
divisors other than `1` and itself. If it does, then it's not a prime number. If it doesn't have any divisors, then it is a 
prime number.
The reason why we only need to check up to the square root of the number is because a larger factor of the number would be a 
multiple of smaller factor that has already been checked.
Please note that this code might take some time for large values of `n` because it's not very efficient. There are more 
efficient algorithms to find prime numbers, but they are also more complex.

Source: cisco.com

Thursday, 27 June 2024

Cisco API Documentations Is Now Adapted for Gen AI Technologies

Developer experience changes rapidly. Many developers and the Cisco DevNet community utilize Generative AI tools and language models for code generation and troubleshooting.

Better data = better model completion

The main challenge for GenAI users is finding valid data for their prompts or Vector Databases. Developers and engineers need to care about the data they plan to use for LLMs/GenAI interaction.

OpenAPI documentation is now available to download


An OpenAPI document is based on a specification that defines a standard way to describe RESTful APIs, including endpoints, parameters, request/response formats, and authentication methods, promoting interoperability and ease of integration.

We at Cisco DevNet care about the developer experience and want to make your work with Cisco APIs efficient, with minimal development and testing costs.

You can find links to OpenAPI documentation in JSON/YAML format on the Open API Documentation page, or search for the related product API and navigate to the API Reference -> Overview section in the left-side menu.

Note: Some API documentation can contain multiple OpenAPI documents.

Purposes for which you can use the related OpenAPI documentation as part of a prompt or RAG pipeline (a minimal sketch follows the list):

  • Construct code or script that utilizes related Cisco API
  • Find related API operations or ask to fix existing code using the information in the API documentation
  • Create integrations with Cisco products through API
  • Create and test AI agents
  • Utilize related Cisco OpenAPI documentation locally or using approved AI tools in your organization.
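As a minimal sketch of the prompt/RAG idea referenced above, the snippet below loads a downloaded OpenAPI document and folds its endpoint summaries into a prompt; the file name and the downstream LLM call are hypothetical placeholders.

```Python
# Prompt-building sketch from a downloaded OpenAPI document.
# "umbrella_openapi.json" is a hypothetical local file name.
import json

def build_prompt(openapi_path: str, task: str) -> str:
    with open(openapi_path) as f:
        spec = json.load(f)
    # Keep the prompt compact: include only the endpoint summaries from the spec
    endpoints = [
        f"{method.upper()} {path}: {op.get('summary', '')}"
        for path, ops in spec.get("paths", {}).items()
        for method, op in ops.items()
        if isinstance(op, dict)
    ]
    return task + "\n\nAPI documentation:\n" + "\n".join(endpoints)

prompt = build_prompt("umbrella_openapi.json",
                      "Write a script that tags roaming computers via the Umbrella API.")
print(prompt[:500])  # pass `prompt` to your organization's approved LLM tooling
```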

Structured vs Unstructured data


I’ve compared two LLM model completions with a prompt that contains two parts. The first part of the prompt was the same and contained the following information:

Based on the following API documentation, please write step-by-step instructions that can help automatically tag roaming computers using Umbrella API.
High-level workflow description:

  1. Add API Key
  2. Generate OAuth 2.0 access token
  3. Create tag
  4. Get the list of roaming computers and identify related ‘originId’
  5. Add tag to devices.

API documentation:

Second part:

In one case, the second part contained data copied and pasted directly from the documentation; in the other, it contained LLM-friendly structured data, i.e., the OpenAPI documents pasted one by one.

[Image: Part of CDO OpenAPI documentation]

[Image: Claude 3 Sonnet model completion – prompt with OpenAPI documents]

[Image: Claude 3 Sonnet model completion – prompt with copy-and-paste data]

Benefits of using LLM-friendly documentation as a part of the prompt


I’ve found that model output was more accurate when we used OpenAPI documents as part of the prompt. The API endpoints provided in each step were more accurate, and recommendations in sections like “Get List of Roaming Computers” contained better, more precise instructions and API operations.

Source: cisco.com

Saturday, 8 June 2024

Cisco AI Assistant for Managing Firewall Policies Is Now Available

Cisco AI Assistant is now available for Cisco XDR and Cisco Defense Orchestrator


Managing firewall policies and locating relevant documentation can be daunting for firewall administrators. However, the AI Assistant integrated with the Cisco Defense Orchestrator (CDO) and the cloud-delivered Firewall Management Center simplifies these processes. With this powerful combination, administrators can effortlessly manage firewall devices, configure policies, and access reference materials whenever required, streamlining their workflow and boosting overall efficiency.

Prerequisites


Administrators need to ensure they have met the following prerequisites to use the AI Assistant:

User roles:

● CDO and cloud-delivered Firewall Management Center – Super Admin or Admin
● On-Prem FMC – Global Domain Admin

Upon successful login into your tenant, you will notice an AI Assistant button positioned in the top menu bar of the dashboard.


Click the AI Assistant button on the CDO or cloud-delivered Firewall Management Center home page to access the AI Assistant.

The Cisco AI Assistant interface contains the following components: Text Input Box, New Chat, Chat History, Expand View, and Feedback.


The Cisco AI Assistant interface follows Generative AI assistant best practices.

AI Assistant interaction


AI Assistant completion with the prompt “Can you provide me with the distinct IP addresses that are currently blocked by our firewall policies?”


AI Assistant completion with the prompt “What access control rules are disabled?”


If you think a response is wrong, click the thumbs-down button below the related completion, then fill out and submit the feedback form.


The AI Assistant can’t proceed with some prompts and questions. In those cases, you will see the following completion:


It looks like the engineering team decided not to display answers when there is insufficient data to answer correctly or in cases where the model might hallucinate.

Source: cisco.com

Thursday, 2 May 2024

Computing that’s purpose-built for a more energy-efficient, AI-driven future


Just as humans use patterns as mental shortcuts for solving complex problems, AI is about recognizing patterns to distill actionable insights. Now think about how this applies to the data center, where patterns have developed over decades. You have cycles where we use software to solve problems, then hardware innovations enable new software to focus on the next problem. The pendulum swings back and forth repeatedly, with each swing representing a disruptive technology that changes and redefines how we get work done with our developers and with data center infrastructure and operations teams.

AI is clearly the latest pendulum swing and disruptive technology that requires advancements in both hardware and software. GPUs are all the rage today due to the public debut of ChatGPT – but GPUs have been around for a long time. I was a GPU user back in the 1990s because these powerful chips enabled me to play 3D games that required fast processing to calculate things like where all those polygons should be in space, updating visuals fast with each frame.

In technical terms, GPUs can process many parallel floating-point operations faster than standard CPUs and in large part that is their superpower. It’s worth noting that many AI workloads can be optimized to run on a high-performance CPU.  But unlike the CPU, GPUs are free from the responsibility of making all the other subsystems within compute work with each other. Software developers and data scientists can leverage software like CUDA and its development tools to harness the power of GPUs and use all that parallel processing capability to solve some of the world’s most complex problems.

A new way to look at your AI needs


Unlike a single, homogeneous infrastructure use case like virtualization, there are multiple patterns within AI that come with different infrastructure needs in the data center. Organizations can think about AI use cases in terms of three main buckets:

1. Build the model, for large foundational training.
2. Optimize the model, for fine-tuning a pre-trained model with specific data sets.
3. Use the model, for inferencing insights from new data.

The least demanding workloads are optimizing and using the model, because most of the work can be done in a single box with multiple GPUs. The most intensive, disruptive, and expensive workload is building the model. In general, if you’re looking to train these models at scale, you need an environment that can support many GPUs across many servers, networked together so that individual GPUs behave as a single processing unit, to solve highly complex problems faster.

This makes the network critical for training use cases and introduces all kinds of challenges to data center infrastructure and operations, especially if the underlying facility was not built for AI from inception. And most organizations today are not looking to build new data centers.

Therefore, organizations building out their AI data center strategies will have to answer important questions like:

  • What AI use cases do you need to support, and based on the business outcomes you need to deliver, where do they fall into the build the model, optimize the model, and use the model buckets?
  • Where is the data you need, and where is the best location to enable these use cases to optimize outcomes and minimize the costs?
  • Do you need to deliver more power? Are your facilities able to cool these types of workloads with existing methods or do you require new methods like water cooling?
  • Finally, what is the impact on your organization’s sustainability goals?

The power of Cisco Compute solutions for AI


As the general manager and senior vice president for Cisco’s compute business, I’m happy to say that Cisco UCS servers are designed for demanding use cases like AI fine-tuning and inferencing, VDI, and many others. With its future-ready, highly modular architecture, Cisco UCS empowers our customers with a blend of high-performance CPUs, optional GPU acceleration, and software-defined automation. This translates to efficient resource allocation for diverse workloads and streamlined management through Cisco Intersight. You can say that with UCS, you get the muscle to power your creativity and the brains to optimize its use for groundbreaking AI use cases.

But Cisco is one player in a wide ecosystem. Technology and solution partners have long been a key to our success, and this is certainly no different in our strategy for AI. This strategy revolves around driving maximum customer value to harness the full long-term potential behind each partnership, which enables us to combine the best of compute and networking with the best tools in AI.

This is the case in our strategic partnerships with NVIDIA, Intel, AMD, Red Hat, and others. One key deliverable has been the steady stream of Cisco Validated Designs (CVDs) that provide pre-configured solution blueprints that simplify integrating AI workloads into existing IT infrastructure. CVDs eliminate the need for our customers to build their AI infrastructure from scratch. This translates to faster deployment times and reduced risks associated with complex infrastructure configurations and deployments.


Another key pillar of our AI computing strategy is offering customers a diversity of solution options that include standalone blade and rack-based servers, converged infrastructure, and hyperconverged infrastructure (HCI). These options enable customers to address a variety of use cases and deployment domains throughout their hybrid multicloud environments – from centralized data centers to edge endpoints. Here are just a couple of examples:

  • Converged infrastructures with partners like NetApp and Pure Storage offer a strong foundation for the full lifecycle of AI development from training AI models to day-to-day operations of AI workloads in production environments. For highly demanding AI use cases like scientific research or complex financial simulations, our converged infrastructures can be customized and upgraded to provide the scalability and flexibility needed to handle these computationally intensive workloads efficiently.
  • We also offer an HCI option through our strategic partnership with Nutanix that is well-suited for hybrid and multi-cloud environments through the cloud-native designs of Nutanix solutions. This allows our customers to seamlessly extend their AI workloads across on-premises infrastructure and public cloud resources, for optimal performance and cost efficiency. This solution is also ideal for edge deployments, where real-time data processing is crucial.

AI Infrastructure with sustainability in mind 


Cisco’s engineering teams are focused on embedding energy management, software and hardware sustainability, and business model transformation into everything we do. Together with energy optimization, these new innovations will have the potential to help more customers accelerate their sustainability goals.

Working in tandem with engineering teams across Cisco, Denise Lee leads Cisco’s Engineering Sustainability Office with a mission to deliver more sustainable products and solutions to our customers and partners. With electricity usage from data centers, AI, and the cryptocurrency sector potentially doubling by 2026, according to a recent International Energy Agency report, we are at a pivotal moment where AI, data centers, and energy efficiency must come together. AI data center ecosystems must be designed with sustainability in mind. Denise outlined the systems design thinking that highlights the opportunities for data center energy efficiency across performance, cooling, and power in her recent blog, Reimagine Your Data Center for Responsible AI Deployments.

Recognition for Cisco’s efforts has already begun. Cisco’s UCS X-Series has received the Sustainable Product of the Year award from the SEAL Awards and an Energy Star rating from the U.S. Environmental Protection Agency. And Cisco continues to focus on critical features in our portfolio through agreement on product sustainability requirements to address the demands on data centers in the years ahead.

Look ahead to Cisco Live


We are just a couple of months away from Cisco Live US, our premier customer event and showcase for the many different and exciting innovations from Cisco and our technology and solution partners. We will be sharing many exciting Cisco Compute solutions for AI and other use cases. Our Sustainability Zone will feature a virtual tour through a modernized Cisco data center where you can learn about Cisco compute technologies and their sustainability benefits. I’ll share more details in my next blog closer to the event.

Source: cisco.com

Thursday, 11 April 2024

Quantum Security and Networking are Emerging as Lifelines in Our Quantum-powered Future


A metamorphosis continues to take shape with the rise of Post-Quantum Cryptography, Quantum Key Distribution, and the brave new world of Quantum Networking.

In the ever-evolving landscape of technology, quantum computing stands out as a beacon of both promise and challenge. As we delve into the world of quantum networking and security, we find ourselves at the intersection of groundbreaking innovation and urgent necessity.

Cisco believes that quantum networking is not just an intriguing concept. It drives our research and investment strategy around quantum computing. We see it as a critical path forward because it holds the key to horizontally scaling systems, including quantum computing systems. Imagine a future where quantum computers collaborate seamlessly across vast distances, solving complex problems that were previously insurmountable.

However, before we can realize the promise of quantum networking, we need to address the elephant in the room – security. When quantum computers become reality, our classical cryptographic methods will face an existential threat. These powerful machines will potentially break today’s encryption algorithms in seconds. Our digital fortresses are vulnerable.

This opens the question of what will happen when quantum computers enter the scene. The issue lies in key exchanges. In classical systems, we rely on public key infrastructure (PKI) to securely exchange keys. This has served us well, ensuring confidentiality and integrity. But quantum computers, with their uncanny ability to factor large numbers efficiently, disrupt this equilibrium. Suddenly, our once-secure secrets hang in the balance.

Getting to the heart of the matter, imagine a scenario that persists even in our current era – the ominous concept of “store now, decrypt later”. Picture an adversary intercepting encrypted data today. Biding their time, they await the moment when quantum supremacy becomes reality.

When that day dawns, they unleash their quantum beast upon the stored information. Our sensitive communications, financial transactions, and personal data will suddenly be laid bare, retroactively vulnerable to the quantum onslaught.

Post-Quantum Cryptography is gaining momentum


Enter Post-Quantum Cryptography (PQC). Recognizing the urgency of the coming quantum moment, the National Institute of Standards and Technology (NIST) has been evaluating PQC proposals and is expected to release its final standards for quantum-resistant cryptographic algorithms later this year. These algorithms are designed to withstand quantum attacks and while not perfect, they are intended to fill the gap until quantum-safe solutions mature.

Apple’s iMessage is a compelling proof point. Last year, Apple made a decisive move by announcing its adoption of PQC algorithms for end-to-end encryption. This strategic shift underscores the industry’s recognition of the looming quantum threat, especially around “store now, decrypt later” attacks, and the need to swiftly respond.

In the year ahead, as we move closer to the post-quantum world, PQC will continue to gain momentum as a data security solution. Cisco’s Liz Centoni shared insight in her tech predictions for 2024, highlighting the accelerating adoption of PQC as a software-based approach that works with conventional systems to protect data from future quantum attacks.

PQC will be used by browsers, operating systems, and libraries, and innovators will experiment with integrating it into protocols such as SSL/TLS 1.3, which governs classic cryptography. PQC will likely find its way into enterprises of every size and sector as they seek to safeguard their sensitive data from the threats posed by quantum computers.

Quantum Key Distribution is the holy grail


Beyond PQC lies the holy grail of quantum cryptography, which is Quantum Key Distribution (QKD). Last year, we accurately predicted that QKD would become more widely used, particularly within cloud computing, data centers, autonomous vehicles, and consumer devices like smartphones.

Unlike classical key exchange methods, QKD capitalizes on the no-cloning property inherent in quantum states, whereby information encoded on one qubit cannot be copied or duplicated to another. Quantum states are fragile and are affected by any and every action, such as measuring the state. In practical terms, that means an eavesdropper can always be discovered, because a “read” causes the photon state to change.

Consider a scenario where two parties, Bank A and Bank B, want to communicate securely. They use QKD, where Bank A sends quantum states (like polarized photons) to Bank B which measures them without knowing the original state.

The measurements are then used to create a shared key, based on a randomly selected subset of the transmitted state (measurement bases) reconciled between the two parties through an authenticated and encrypted classical channel. Since the eavesdropper does not know the random subset, any attempt to measure the transmitted information will be detected due to a disturbance in the quantum states.

The beauty lies in the provably secure nature of QKD — quantum mechanics forbids perfect cloning, rendering interception futile. In this dance of particles and principles, QKD stands as a lighthouse of security, promising a future where quantum and classical work in tandem to safeguard us.
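To make the idea concrete, here is a toy, classical simulation of the BB84-style exchange described above; it is a sketch of the statistics only (random bases, sifting, and the roughly 25% error rate an intercept-and-resend eavesdropper introduces), not a model of real quantum hardware.

```Python
# Toy BB84-style simulation: random bits and bases, basis reconciliation, and error checking.
import random

def bb84(n_qubits=2000, eavesdrop=False, seed=1):
    rng = random.Random(seed)
    alice_bits = [rng.randint(0, 1) for _ in range(n_qubits)]
    alice_bases = [rng.choice("+x") for _ in range(n_qubits)]
    bob_bases = [rng.choice("+x") for _ in range(n_qubits)]

    bob_bits = []
    for bit, a_basis, b_basis in zip(alice_bits, alice_bases, bob_bases):
        if eavesdrop and rng.choice("+x") != a_basis:
            # Eve measures in a mismatched basis, randomizing the state she resends
            bit = rng.randint(0, 1)
        # Bob's measurement: correct when his basis matches Alice's, random otherwise
        bob_bits.append(bit if b_basis == a_basis else rng.randint(0, 1))

    # Basis reconciliation over the classical channel: keep only matching-basis positions
    sifted = [(a, b) for a, b, ab, bb in zip(alice_bits, bob_bits, alice_bases, bob_bases) if ab == bb]
    return sum(a != b for a, b in sifted) / len(sifted)

print(f"error rate without eavesdropper: {bb84(eavesdrop=False):.2%}")
print(f"error rate with eavesdropper:    {bb84(eavesdrop=True):.2%}")  # noticeably higher, revealing the intrusion
```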

For instance, integrating QKD in 5G communication infrastructure is becoming increasingly important. With QKD, organizations will be able to better protect the privacy and authenticity of data transmitted over low-latency, high-speed networks, explicitly addressing the security demands of the 5G era.

Efforts to make QKD solutions more accessible and interoperable are accelerating in response to the demand for even more secure data transfer. This is leading to commercialization and standardization initiatives that are expected to make QKD solutions more user friendly and cost effective, ultimately driving widespread adoption across new applications and sectors.

As strides continue toward achieving quantum-secure messaging, among the first organizations to more broadly implement PQC will likely be those responsible for critical infrastructure and essential government suppliers. Large enterprises and other organizations will follow, also implementing these algorithms within the next few years.

Quantum networking on the horizon


Depending on the desired level of security and performance required, Centoni explained that QKD can be used as either an alternative or a complement to PQC and, in the future, will also leverage quantum networking. However, she acknowledges that it’s early days for quantum networks.

So far, researchers have not successfully achieved sustained quantum networking on a large scale, but major discoveries and advancements are happening. Companies like Cisco, alongside cutting-edge leaders across various industries, are pouring billions into unlocking the awesome potential of quantum networks.

“Quantum networking will see significant new research and investment by government and financial services,” said Centoni. She predicts that this will also include sectors with high demand for data security and the kinds of workloads that perform well with quantum computers.

Quantum networking relies on teleportation principles of quantum mechanics to transmit information between two or more quantum computers. This takes place by manipulating qubits whereby they “entangle” with one another and enable instantaneous transfer of quantum information across vast distances – even when there’s no physical connection between the computers.

In the not-so-distant future, perhaps 4 to 5 years or more, quantum networking will inexorably emerge as a potent force. With quantum networking, quantum computers will be able to collaborate and exchange information to tackle intricate problems that no single quantum computer could solve on its own.

By leveraging the quantum principles of teleportation and non-cloning, quantum networking protocols will facilitate fast, reliable – and perhaps even unconditional – secure information exchange. Potential applications of quantum networking go far beyond cryptography, as well, to turbocharging drug discovery, artificial intelligence (AI), and materials science.

Looking to the post-quantum future


Today, quantum computers are at a very similar stage that mainframes were in the 1960s. Back then, very few organizations could afford those machines, which could fill an entire room. While QKD is now in use as a means of provably secure communication, quantum networking remains mainly theoretical.

QKD is the next generation of quantum cryptography, a step beyond PQC which is not provably secure because of the lack of a proof of mathematical hardness for the cryptographic algorithms. Quantum networking should be thought of as first, a substrate needed for QKD, and then building out larger and larger compute islands – such as data centers and LAN, then WAN – analogous to how classical computers were connected to build distributed computing.

The big challenge now, like the past, is to create quantum computers that can be both reliably and affordably scaled up and put into the hands of corporate, government, and research entities. As such, distributed quantum computing will be the primary driver for quantum networks. We may even see the advent of the quantum cloud and the quantum internet – the metamorphic network of the future.

Quantum networking and security are not mere buzzwords. They are our lifelines in a quantum-powered future. As we race against time, we must embrace quantum technologies while fortifying our defenses. The ultimate payoff is a network that’s more secure than anything we’ve known before — a network where quantum and classical dance harmoniously, protecting our digital existence.

Source: cisco.com

Tuesday, 26 March 2024

GenAI will Transform B2B Interactions and Solutions in the Year Ahead with New Depth of Context and Control

GenAI will Transform B2B Interactions and Solutions in the Year Ahead with New Depth of Context and Control

Human-like interaction with B2B solutions, bespoke multimodal LLMs for better accuracy and precision, curated workflow automation via LAMs and customized B2B applications will become the norm as GenAI expands in the business sphere.

With the rapid launch of new solutions powered by generative AI (GenAI), the business-to-business (B2B) landscape is being reshaped before our eyes. Many organizations have taken a cautious and meticulously planned approach to widespread adoption of artificial intelligence (AI); however, the Cisco AI Readiness Index reveals just how much pressure they are now feeling.

Sixty-one percent of organizations anticipate adverse business impacts if they have not implemented an AI strategy within the next year. In some cases, the window may be even narrower as competitors pull away, leaving very little time to properly execute plans. The clock is ticking, and the call for AI integration – especially GenAI – is now louder than ever.

In her predictions of tech trends for the new year, Chief Strategy Officer and GM of Applications Liz Centoni said GenAI-powered Natural Language Interfaces (NLIs) will become the norm for new products and services. "NLIs powered by GenAI will be expected for new products and more than half will have this by default by the end of 2024."

NLIs allow users to interact with applications and systems using everyday language and spoken commands – as with AI assistants, for instance – to trigger functionality and dig for deeper understanding. This capability will become available across most business-to-consumer (B2C) applications and services in 2024, especially for question-and-answer (Q&A) interactions between a human and a "machine". However, associated B2B workflows and dependencies will require additional context and control for GenAI solutions to effectively elevate the overall business.

The point-and-click approach enabled by graphical user interfaces (GUIs) effectively binds users to a limited set of capabilities and a restricted view of data, based on the GUI requirements the business set at the point of design. Multi-modal prompt interfaces (mainly text and audio) are fast changing that paradigm and expanding the UI/UX potential and scope. In the coming year, we'll see B2B organizations increasingly leverage NLIs and context to "ask" specific questions about available data, freeing them from traditional constraints and offering a faster path to insight for complex queries and interactions.

A good example is the contact center and its customer-support chatbots as a B2C interface. Their user experience will continue to be transformed by GenAI-enabled NLIs and multi-modal assistants in 2024, but the natural next step is to enrich GenAI with additional context, enabling it to draw on B2B dependencies (like services) and back-end systems such as application programming interfaces (APIs) to further boost accuracy and reach, minimize response time, and enhance user satisfaction, as sketched below.
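A minimal sketch of that pattern follows. It is illustrative only: the api.example.com endpoint, the get_order_status helper, and the call_llm placeholder are hypothetical, not a Cisco or contact-center product API; the point is simply that the assistant grounds its answer in live back-end context rather than in the model's static training data.

import json
import urllib.request


def call_llm(prompt: str) -> str:
    """Placeholder for whichever GenAI model or service the organization uses."""
    raise NotImplementedError("Wire this to your LLM provider of choice.")


def get_order_status(order_id: str) -> dict:
    """Hypothetical back-end API that supplies live business context to the assistant."""
    with urllib.request.urlopen(f"https://api.example.com/orders/{order_id}") as resp:
        return json.load(resp)


def answer_customer(question: str, order_id: str) -> str:
    # Fetch live context from a back-end system (a B2B dependency), then ask the
    # model to answer the customer's question using only that context.
    context = get_order_status(order_id)
    prompt = (
        "You are a contact-center assistant. Answer using only this context.\n"
        f"Context: {json.dumps(context)}\n"
        f"Customer question: {question}"
    )
    return call_llm(prompt)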

Meanwhile, as the relevance of in-context, faster paths to insight increases and the associated GenAI-enabled data flows become mainstream, large action models (LAMs) will start to be considered as a potential next step to automate some enterprise workflows, most likely starting in the realm of IT, security, and auditing and compliance.

Additional B2B considerations with GenAI


As Centoni said, GenAI will be increasingly leveraged in B2B interactions with users demanding more contextualized, personalized, and integrated solutions. “GenAI will offer APIs, interfaces, and services to access, analyze, and visualize data and insights, becoming pervasive across areas such as project management, software quality and testing, compliance assessments, and recruitment efforts. As a result, observability for AI will grow.”

As the use of GenAI grows exponentially, the need for comprehensive and deeper observability will grow with it. AI is revolutionizing the way we analyze and process data, and observability is fast evolving alongside it to offer a more intelligent and automated approach – from monitoring and triage across real-time dependencies to troubleshooting complex systems and deploying automated actions and responses.

Observability over modern applications and systems, including those that are powered by or leverage AI capabilities, will be increasingly augmented by GenAI for root-cause analysis, predictive analysis and, for example, to drill down on multi-cloud resource allocation and costs, as well as the performance and security of digital experiences.

Driven by growing customer demand for integrated solutions that can be adapted to specific needs, B2B providers are turning to GenAI to power services that boost productivity and accomplish tasks more efficiently than their current systems and implementations. Among these capabilities is the ability to access and analyze vast volumes of data to derive insights that can be used to develop new products, optimize dependencies, and design and refine the digital experiences supported by applications.

Starting in 2024, GenAI will be an integral part of the business context, so observability will naturally need to extend to it, widening the scope of full-stack observability. Besides costs, GenAI-enabled B2B interactions will be particularly sensitive to both latency and jitter. This fact alone will drive significant growth in demand over the coming year for end-to-end observability – spanning the internet as well as critical networks – to keep AI-powered applications running at peak performance for these B2B interactions.

On the other hand, as businesses recognize potential pitfalls and seek greater control and flexibility over their AI model training, data retention, and expendability processes, demand for bespoke and domain-specific GenAI large language models (LLMs) will also increase significantly in 2024. As a result, organizations will pick up the pace of adapting GenAI LLMs to their specific requirements and contexts by leveraging private data, introducing up-to-date information via retrieval augmented generation (RAG), fine-tuning parameters, and scaling models appropriately.
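As a minimal sketch of the RAG pattern just described, the snippet below assembles a grounded prompt from an organization's private documents. The embed function is a stand-in (a real deployment would call an embedding model), and the document store and prompt wording are hypothetical; the key idea is retrieve-then-generate.

import numpy as np


def embed(text: str) -> np.ndarray:
    """Placeholder embedding: a real deployment would call an embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)


def retrieve(question: str, documents: list[str], k: int = 3) -> list[str]:
    # Rank the organization's private, up-to-date documents by similarity
    # to the question and keep the top k as grounding context.
    q = embed(question)

    def score(doc: str) -> float:
        d = embed(doc)
        return float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))

    return sorted(documents, key=score, reverse=True)[:k]


def build_rag_prompt(question: str, documents: list[str]) -> str:
    context = "\n".join(retrieve(question, documents))
    # The assembled prompt is what gets sent to the (possibly fine-tuned) LLM.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

The practical upside is that the system's knowledge is refreshed by updating the document store, not by retraining the model.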

Moving fast towards contextual understanding and reasoning


GenAI has already evolved from reliance on a single data modality to include training on text, images, video, audio, and other inputs simultaneously. Just as humans learn by taking in multiple types of data to create more complete understanding, the growing ability of GenAI to consume multiple modalities is another significant step towards greater contextual understanding.

These multi-modal capabilities are still in the early stages, although they are already being considered for business interactions. Multi-modality is also key to the future of LAMs – sometimes called AI agents – as they bring complex reasoning, multi-hop thinking, and the ability to generate actionable outputs.

True multi-modality not only improves overall accuracy, it also exponentially expands the possible use cases, including for B2B applications. Consider a customer sentiment model tied to a forecast-trending application that can capture and interpret audio, text, and video for complete insight, including context such as tone of voice and body language, instead of simply transcribing the audio. Recent advances allow RAG to handle both text and images: in a multi-modal setup, images can be retrieved from a vector database and passed to a large multimodal model (LMM) for generation. The RAG method thus enhances efficiency; the underlying model can still be fine-tuned, and the system's knowledge can be updated easily by refreshing the retrieval corpus, without retraining the entire model.
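A hedged sketch of that multi-modal retrieval step is shown below. The ImageDoc structure, the in-memory store, and the dot-product ranking are illustrative placeholders rather than a specific vector-database or LMM API; the generation call itself is deliberately omitted.

from dataclasses import dataclass


@dataclass
class ImageDoc:
    path: str               # reference to the stored image
    embedding: list[float]  # vector produced by a multimodal embedding model


def retrieve_images(question_embedding: list[float], store: list[ImageDoc], k: int = 2) -> list[str]:
    # Nearest-neighbour search over image embeddings held in the vector database.
    def score(doc: ImageDoc) -> float:
        return sum(a * b for a, b in zip(question_embedding, doc.embedding))

    return [d.path for d in sorted(store, key=score, reverse=True)[:k]]


def multimodal_rag_request(question: str, question_embedding: list[float], store: list[ImageDoc]) -> dict:
    # The retrieved images and the question would then be passed together to a
    # large multimodal model (LMM) for generation; that call is omitted here.
    return {"question": question, "images": retrieve_images(question_embedding, store)}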

With RAG in the picture, consider a model that identifies and analyzes commonalities and patterns in job interview data by consuming resumes, job requisitions from across the industry (from peers and competitors), and online activity (from social media to posted video lectures), and is then augmented by also consuming candidate-recruiter email interactions as well as the actual interview video calls. That example shows why both RAG and responsible AI will be in high demand during 2024.

In summary, in the year ahead we will begin to see a more robust emergence of specialized, domain-specific AI models. There will be a shift towards smaller, specialized LLMs that offer higher levels of accuracy, relevancy, precision, and efficiency for individual organizations and needs, along with niche domain understanding.

RAG and specialized LLMs and LMMs complement each other: RAG ensures accuracy and context, while smaller LLMs optimize efficiency and domain-specific performance. Also in the year ahead, LAM development and relevance will grow, focusing on the automation of user workflows and aiming to cover the "actions" aspect missing from LLMs.

The next frontier of GenAI will bring both evolutionary change and entirely new capabilities to B2B solutions. Reshaping business processes, user experience, observability, security, and automated actions, this new AI-driven era is taking shape as we speak, and 2024 will be an inflection point in that process. Exciting times!

Source: cisco.com