Technology

What’s the minimum viable infrastructure your enterprise needs for AI?



This article is part of a VB Special Issue called “Fit for Purpose: Tailoring AI Infrastructure.” Catch all the other stories here.

As we approach the midpoint of the 2020s, enterprises of all sizes and sectors are increasingly looking at how to adopt generative AI to increase efficiencies and reduce time spent on repetitive, onerous tasks.

In some ways, having some sort of generative AI application or assistant is rapidly moving from a “nice to have” to a “must have.”


But what is the minimum viable infrastructure needed to achieve these benefits? Whether you’re a large organization or a small business, understanding the essential components of an AI solution is crucial.

This guide — informed by leaders in the sector including experts at Hugging Face and Google — outlines the key elements, from data storage and large language model (LLM) integration to development resources, costs and timelines, to help you make informed decisions.


Data storage and data management

The foundation of any effective gen AI system is data — specifically your company’s data, or at least, data that is relevant to your firm’s business and/or goals.


Yes, your business can immediately use off-the-shelf chatbots powered by LLMs such as Google’s Gemini, OpenAI’s ChatGPT or Anthropic’s Claude, all readily available on the web, which may assist with specific company tasks. And it can do so without inputting any company data.

However, unless you feed these your company’s data — which may not be allowed due to security concerns or company policies — you won’t be able to reap the full benefits of what LLMs can offer.

So step one in developing any helpful AI product for your company, whether for internal or external use, is understanding what data you have and can share with an LLM (whether a public model or a private one you control on your own servers), where that data is located, and whether it is structured or unstructured.

Structured data is typically organized in databases and spreadsheets, with clearly defined fields like dates, numbers and text entries. Financial records or customer data that fit neatly into rows and columns are classic examples of structured data.


Unstructured data, on the other hand, lacks a consistent format and is not organized in a predefined manner. It includes various types of content like emails, videos, social media posts and documents, which do not fit easily into traditional databases. This type of data is more challenging to analyze due to its diverse and non-uniform nature.

This data can include everything from customer interactions and HR policies to sales records and training materials. Depending on your use case for AI — developing products internally for employees or externally for customers — the route you go will likely change.
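To make the distinction concrete, here is a minimal Python sketch contrasting the two; the record fields and email text are hypothetical stand-ins, not a schema from any real system.

```python
# Structured data: fixed, named fields that can be queried by column.
customer_record = {
    "customer_id": 1042,           # hypothetical field names for illustration
    "order_date": "2024-09-27",
    "product": "Oak Dining Chair",
    "amount_usd": 129.99,
}

# Unstructured data: free-form text with no predefined schema.
support_email = (
    "Hi, I bought two chairs last month and one arrived with a "
    "cracked leg. Can you send a replacement or refund me?"
)

# A structured record can be filtered or summed directly; unstructured
# text must first be parsed, embedded or summarized by a model.
print(customer_record["amount_usd"])
print(len(support_email.split()), "words of free text")
```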

Let’s take a hypothetical furniture maker — the “Chair Company” — that makes chairs for consumers and businesses out of wood.

This Chair Company wants to create an internal chatbot for employees to use that can answer common questions such as how to file expenses, how to request time off and where files for building chairs are located.


The Chair Company may in this case already have these files stored on a cloud service such as Google Cloud, Microsoft Azure or AWS. For many businesses, integrating AI capabilities directly into existing cloud platforms can significantly simplify the deployment process.

Google Workspace, combined with Vertex AI, enables enterprises to leverage their existing data across productivity tools like Docs and Gmail.

A Google spokesperson explained to VentureBeat, “With Vertex AI’s Model Garden, businesses can choose from over 150 pre-built models to fit their specific needs, integrating them seamlessly into their workflows. This integration allows for the creation of custom agents within Google Workspace apps, streamlining processes and freeing up valuable time for employees.”

For example, Bristol Myers Squibb used Vertex AI to automate document processes in its clinical trials, demonstrating how powerful these integrations can be in transforming business operations. For smaller businesses or those new to AI, this integration provides a user-friendly entry point to harness the power of AI without extensive technical overhead.


But what if the company has data stored only on an intranet or local private servers? The Chair Company, or any other in a similar boat, can still leverage LLMs and build a chatbot to answer company questions. However, it will likely want to deploy one of the many open models available from the AI community hub Hugging Face instead.

“If you’re in a highly regulated industry like banking or healthcare, you might need to run everything in-house,” explained Jeff Boudier, head of product and growth at Hugging Face, in a recent interview with VentureBeat. “In such cases, you can still use open-source tools hosted on your own infrastructure.”
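To give a sense of what that looks like in code, here is a minimal self-hosting sketch using Hugging Face’s transformers library. The model name is an illustrative assumption (swap in any open chat model your license and hardware allow), and pinning a revision is what keeps behavior stable when the upstream model changes.

```python
# Minimal self-hosted assistant sketch with Hugging Face transformers.
# Assumes: `pip install transformers accelerate torch` and hardware with
# enough memory for the chosen model. The model id is illustrative.
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-3.1-8B-Instruct",  # hypothetical choice
    # revision="<commit-hash>",  # pin a snapshot so upstream updates can't change behavior
    device_map="auto",  # spread weights across available GPUs, else CPU
)

messages = [
    {"role": "system", "content": "You answer internal Chair Company HR questions."},
    {"role": "user", "content": "How do I request time off?"},
]

# The pipeline returns the conversation with the assistant's reply appended.
result = chat(messages, max_new_tokens=200)
print(result[0]["generated_text"][-1]["content"])
```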

Boudier recorded the following demo video for VentureBeat showing how to use Hugging Face’s website and available models and tools to create an AI assistant for a company.


A Large Language Model (LLM)

Once you’ve determined what company data you can and want to feed into an AI product, the next step is selecting the large language model (LLM) that will power it.

Choosing the right LLM is a critical step in building your AI infrastructure. Proprietary models such as OpenAI’s GPT-4, conversational platforms such as Google’s Dialogflow and the open models hosted on Hugging Face all offer different capabilities and levels of customization. The choice depends on your specific needs, data privacy concerns and budget.

Those charged with overseeing and implementing AI integration at a company will need to assess and compare different LLMs, which they can do using websites and services such as the LMSYS Chatbot Arena Leaderboard on Hugging Face.

If you go the route of a proprietary LLM such as OpenAI’s GPT series, Anthropic’s Claude family or Google’s Gemini series, you’ll need to connect the LLM to your application and data via the provider’s application programming interface (API).
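In code, that integration is usually a thin API call. Below is a hedged sketch using OpenAI’s Python client as one example; the model name and the questions are placeholders, and an API key must already be set in your environment.

```python
# Minimal sketch of querying a proprietary LLM through its API.
# Assumes: `pip install openai` and OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative; choose the model tier that fits your budget
    messages=[
        {"role": "system", "content": "You are an internal helpdesk assistant."},
        {"role": "user", "content": "How do I file an expense report?"},
    ],
)
print(response.choices[0].message.content)
```

Your company data would typically reach the model either as context appended to each request or through a retrieval layer, covered in the RAG section below.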


Meanwhile, if the Chair Company or your business wants to host a model on its own private infrastructure for enhanced control and data security, then an open-source LLM is likely the way to go.

As Boudier explains, “The main benefit of open models is that you can host them yourself. This ensures that your application’s behavior remains consistent, even if the original model is updated or changed.”

Already, VentureBeat has reported on the growing number of businesses adopting open-source LLMs and AI models such as Meta’s Llama family, along with models from other providers and independent developers.

Retrieval-Augmented Generation (RAG) framework

For a chatbot or AI system to provide accurate and relevant responses, integrating a retrieval-augmented generation (RAG) framework is essential.


This involves using a retriever to search for relevant documents based on user queries and a generator (an LLM) to synthesize the information into coherent responses.

Implementing a RAG framework requires a vector database like Pinecone or Milvus, which stores document embeddings: numerical vector representations of your data that make it easy for the AI to retrieve relevant information.

The RAG framework is particularly useful for enterprises that need to integrate proprietary company data stored in various formats, such as PDFs, Word documents and spreadsheets.

This approach allows the AI to pull relevant data dynamically, ensuring that responses are up-to-date and contextually accurate.
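A bare-bones version of that retrieve-then-generate loop can be sketched in a few lines of Python. The snippet below uses the sentence-transformers library and an in-memory list in place of a vector database; a production system would store the embeddings in Pinecone, Milvus or similar, and the documents and model name here are illustrative.

```python
# Minimal RAG retrieval sketch: embed documents, find the best match for a query.
# Assumes: `pip install sentence-transformers numpy`.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, widely used embedding model

documents = [  # stand-ins for chunks of your PDFs, Word files and spreadsheets
    "Expense reports are filed through the finance portal within 30 days.",
    "Time-off requests go to your manager through the HR system.",
    "Chair blueprints live on the shared drive under /designs/chairs.",
]
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

query = "How do I get reimbursed for travel?"
query_vector = embedder.encode([query], normalize_embeddings=True)[0]

# With normalized vectors, cosine similarity is just a dot product.
scores = doc_vectors @ query_vector
best_doc = documents[int(np.argmax(scores))]

# In a full pipeline, best_doc plus the query would be passed to the LLM
# so the generated answer is grounded in current company data.
print(best_doc)
```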


According to Boudier, “Creating embeddings or vectorizing documents is a crucial step in making data accessible to the AI. This intermediate representation allows the AI to quickly retrieve and utilize information, whether it’s text-based documents or even images and diagrams.”

Development expertise and resources

While AI platforms are increasingly user-friendly, some technical expertise is still required for implementation. Here’s a breakdown of what you might need:

  • Basic Setup: For straightforward deployment using pre-built models and cloud services, your existing IT staff with some AI training should suffice.
  • Custom Development: For more complex needs, such as fine-tuning models or deep integration into business processes, you’ll need data scientists, machine learning engineers, and software developers experienced in NLP and AI model training.

For businesses lacking in-house resources, partnering with an external agency is a viable option. Development costs for a basic chatbot range from $15,000 to $30,000, while more complex AI-driven solutions can exceed $150,000.

“Building a custom AI model is accessible with the right tools, but you’ll need technical expertise for more specialized tasks, like fine-tuning models or setting up a private infrastructure,” Boudier noted. “With Hugging Face, we provide the tools and community support to help businesses, but having or hiring the right talent is still essential for successful implementation.”

For businesses without extensive technical resources, Google’s AppSheet offers a no-code platform that allows users to create custom applications by simply describing their needs in natural language. Integrated with AI capabilities like Gemini, AppSheet enables rapid development of tools for tasks such as facility inspections, inventory management and approval workflows—all without traditional coding skills. This makes it a powerful tool for automating business processes and creating customized chatbots.


Time and budget considerations

Implementing an AI solution involves both time and financial investment. Here’s what to expect:

  • Development Time: A basic chatbot can be developed in 1-2 weeks using pre-built models. However, more advanced systems that require custom model training and data integration may take several months.
  • Cost: For in-house development, budget around $10,000 per month, with total costs potentially reaching $150,000 for complex projects. Subscription-based models offer more affordable entry points, with costs ranging from $0 to $5,000 per month depending on features and usage.

Deployment and maintenance

Once developed, your AI system will need regular maintenance and updates to stay effective. This includes monitoring, fine-tuning and possibly retraining the model as your business needs and data evolve. Maintenance costs can start at $5,000 per month, depending on the complexity of the system and the volume of interactions.
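Monitoring does not have to be elaborate on day one. As a minimal sketch (the log schema and file path are assumptions, not a standard), simply recording every interaction with timing metadata gives you the raw material to spot latency spikes, failures and drifting answer quality later:

```python
# Minimal interaction-logging sketch for ongoing AI system monitoring.
# Each chat turn is appended to a JSONL file for later review or analysis.
import json
import time
from datetime import datetime, timezone

LOG_PATH = "assistant_interactions.jsonl"  # hypothetical location

def log_interaction(question: str, answer: str, latency_s: float) -> None:
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "question": question,
        "answer": answer,
        "latency_s": round(latency_s, 3),
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

start = time.perf_counter()
answer = "Expenses are filed through the finance portal."  # stand-in for a model call
log_interaction("How do I file expenses?", answer, time.perf_counter() - start)
```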

If your enterprise operates in a regulated industry like finance or healthcare, you may need to host the AI system on private infrastructure to comply with data security regulations. Boudier explained, “For industries where data security is paramount, hosting the AI model internally ensures compliance and full control over data and model behavior.”

Final takeaways

To set up a minimum viable AI infrastructure for your enterprise, you need:

  • Data Storage and Management: Organize and manage your data efficiently using an intranet, private servers, private clouds, hybrid clouds or commercial cloud platforms like Google Cloud, Azure or AWS.
  • A Suitable LLM: Choose a model that fits your needs, whether hosted on a cloud platform or deployed on private infrastructure.
  • A RAG Framework: Implement this to dynamically pull and integrate relevant data from your knowledge base.
  • Development Resources: Consider in-house expertise or external agencies for building, deploying, and maintaining your AI system.
  • Budget and Time Allocation: Prepare for initial costs ranging from $15,000 to $150,000 and development time of a few weeks to several months, depending on complexity.
  • Ongoing Maintenance: Regular updates and monitoring are necessary to ensure the system remains effective and aligned with business goals.

By aligning these elements with your business needs, you can create a robust AI solution that drives efficiency, automates tasks, and provides valuable insights—all while maintaining control over your technology stack.



Technology

Samsung unboxes the all-new Galaxy S24 FE: Video


The Samsung Galaxy S24 FE was announced yesterday, and the company has just unboxed it. This unboxing was presented in the form of a promo video which surfaced on the company’s official YouTube channel.

The Galaxy S24 FE unboxing video is now live on YouTube

The unboxing video runs around a minute and a half, and it’s embedded at the very end of the article. It not only shows you what sits in the box but also gives you a look at all the colors of the device. On top of that, some features are highlighted too.

In addition to the phone itself and some paperwork, the box includes a SIM ejection pin and a USB-C to USB-C charging cable. No, a charger is not included in the box.

The phone comes in Blue, Graphite, Gray, Mint and Yellow, in case you missed the memo yesterday. The video shows the device from all angles, including its flat sides, a change from last year’s model. The device is also larger this time around, thanks to a bigger display.


Samsung claims this phone offers “stunning low-light portraits”

Samsung says that the Galaxy S24 FE can provide “stunning low-light portraits” thanks to its ProVisual Engine. The cameras are highlighted in the video in general, and the same goes for the SoC. This phone is fueled by the Exynos 2400e chip.

A 6.7-inch AMOLED display with a 120Hz refresh rate is used this time around, instead of a 6.4-inch panel. It’s also brighter now, as it has a peak brightness of 1,900 nits. Gorilla Glass Victus+ protects that display, by the way.

The device is also rated for water and dust resistance. If you’d like to know more about the Galaxy S24 FE, check out our original announcement. You can also pre-order the device as we speak, if you’re interested.


Science & Environment

WTI heads for weekly loss as supplies rise



The oil market today doesn't preemptively price in risk, says S&P Global's Dan Yergin

U.S. crude oil on Friday was on pace for its first weekly loss in three weeks, as the prospect of growing oil supplies from Saudi Arabia overshadowed China’s efforts to stimulate its economy.

The U.S. benchmark West Texas Intermediate is down nearly 6% this week, while global benchmark Brent has pulled back nearly 4%. Prices have fallen even as conflict in the Middle East escalates, with Israel and Hezbollah trading blows in Lebanon.

“It is amazing to see that … war doesn’t affect the price, and that’s because there’s been no disruption,” Dan Yergin, vice chairman of S&P Global, told CNBC’s “Squawk Box” Friday.

“There’s still over 5 million barrels a day of shut-in capacity in the Middle East,” Yergin said.

Here are Friday’s energy prices:


  • West Texas Intermediate November contract: $67.51 per barrel, down 16 cents, or 0.24%. Year to date, U.S. crude oil is down more than 5%.
  • Brent November contract: $71.37 per barrel, off 23 cents, or 0.32%. Year to date, the global benchmark is down about 7%.
  • RBOB Gasoline October contract: $1.9596 per gallon, little changed. Year to date, gasoline is down about 7%.
  • Natural Gas November contract: $2.774 per thousand cubic feet, up 0.76%. Year to date, gas is up about 10%.

Oil sold off Thursday on a report that Saudi Arabia is committed to increasing production later this year, even if it results in lower prices for a prolonged period.

OPEC+ recently postponed planned output hikes from October to December, but analysts have speculated that the group might delay the hikes again because oil prices are so low.

The oil selloff erased gains from earlier in the week after China unveiled a new round of economic stimulus measures. Soft demand in China has been weighing on the oil market for months.

“The thing that’s dominated the market is the weakness in China. Half the growth in world oil demand over a number of years has simply been in China, and it hasn’t been happening,” Yergin said.

“The big question is, with stimulus, will you see a recovery in China?” he said. “That’s what the market is struggling with.”



Servers computers

42U Adjustable Depth Open Frame 4 Post Server Rack Cabinet – 4POSTRACK42 | StarTech.com




The 4POSTRACK42 42U Server Rack lets you store your servers, network and telecommunications equipment in a sturdy, adjustable depth open-frame rack.

Designed with ease of use in mind, this 42U rack offers easy-to-read markings for both rack units (U) and depth, with a wide range of mounting-depth adjustments (22 to 40 inches) that makes it easy to adapt the rack to fit your equipment.

This durable 4-post rack supports a static loading capacity of up to 1,320 lbs. (600 kg) and complies with several industry rack standards (EIA/ECA-310, IEC 60297, DIN 41494) for a universal design that’s compatible with most rack equipment.

For a complete rack solution that saves you time and hassle, the rack includes optional accessories such as casters, leveling feet and cable management hooks. The base is also pre-drilled for securing the rack to the floor if needed, providing you with many options to customize the rack to fit your environment.

Backed by a StarTech.com 2-year warranty and free lifetime technical support.

To learn more visit StarTech.com


Technology

Intel reportedly rebuffed an offer from ARM to buy its product unit


Intel’s fortunes have declined so rapidly over the past year that chip designer ARM made a “high level inquiry” about buying its crown jewel product unit, Bloomberg reported. However, Intel said the division wasn’t for sale and turned down the offer, according to an unnamed insider.

There are two main units inside Intel: the product group, which sells PC, server and networking chips, and the chip-manufacturing foundry. ARM had no interest in Intel’s foundry division, according to Bloomberg‘s sources. ARM and Intel representatives declined to comment.

Intel’s fortunes have been on the wane for years, but the decline over the last 12 months has been especially dramatic. Following a net $1.6 billion loss in Q2 2024, the company announced that it was laying off 15,000 employees as part of a $10 billion cost reduction plan. Last week, the company also revealed plans to transform its ailing foundry business into an independent subsidiary. Intel lost half its market value last year and is now worth $102.3 billion.

ARM sells its processor designs to Qualcomm, Apple and other manufacturers (mostly for mobile phones) but doesn’t build any chips itself. Purchasing Intel’s product division would completely transform its business model, though that scenario seems highly improbable.


With Intel wounded at the moment, rivals have been circling. Qualcomm also expressed interest in taking over Intel recently, according to a report from last week. Any mergers related to ARM and Qualcomm would be regulatory nightmares, but the fact that the offers exist at all shows Intel’s vulnerability.

Intel has other avenues to boost investment. Apollo Global Management (the owner of Yahoo and Engadget) has offered to invest as much as $5 billion in the company, according to a recent Bloomberg report. Intel also plans to sell part of its stake in chip-maker Altera to private equity investors.


Servers computers

9U Wallmount Server Rack with Single Glass Door




This 9U wallmount server rack with a single glass door was assembled by students in the computer network engineering program.


Technology

From cost center to competitive edge: The strategic value of custom AI Infrastructure



This article is part of a VB Special Issue called “Fit for Purpose: Tailoring AI Infrastructure.” Catch all the other stories here.

AI is no longer just a buzzword — it’s a business imperative. As enterprises across industries continue to adopt AI, the conversation around AI infrastructure has evolved dramatically. Once viewed as a necessary but costly investment, custom AI infrastructure is now seen as a strategic asset that can provide a critical competitive edge.

Mike Gualtieri, vice president and principal analyst at Forrester, emphasizes the strategic importance of AI infrastructure. “Enterprises must invest in an enterprise AI/ML platform from a vendor that at least keeps pace with, and ideally pushes the envelope of, enterprise AI technology,” Gualtieri said. “The technology must also serve a reimagined enterprise operating in a world of abundant intelligence.” This perspective underscores the shift from viewing AI as a peripheral experiment to recognizing it as a core component of future business strategy.


The infrastructure revolution

The AI revolution has been fueled by breakthroughs in AI models and applications, but those innovations have also created new challenges. Today’s AI workloads, especially around training and inference for large language models (LLMs), require unprecedented levels of computing power. This is where custom AI infrastructure comes into play.


“AI infrastructure is not one-size-fits-all,” says Gualtieri. “There are three key workloads: data preparation, model training and inference.” Each of these tasks has different infrastructure requirements, and getting it wrong can be costly, according to Gualtieri. For example, while data preparation often relies on traditional computing resources, training massive AI models like GPT-4o or Llama 3.1 necessitates specialized chips such as Nvidia’s GPUs, Amazon’s Trainium or Google’s TPUs.

Nvidia, in particular, has taken the lead in AI infrastructure, thanks to its GPU dominance. “Nvidia’s success wasn’t planned, but it was well-earned,” Gualtieri explains. “They were in the right place at the right time, and once they saw the potential of GPUs for AI, they doubled down.” However, Gualtieri believes that competition is on the horizon, with companies like Intel and AMD looking to close the gap.


The cost of the cloud

Cloud computing has been a key enabler of AI, but as workloads scale, the costs associated with cloud services have become a point of concern for enterprises. According to Gualtieri, cloud services are ideal for “bursting workloads” — short-term, high-intensity tasks. However, for enterprises running AI models 24/7, the pay-as-you-go cloud model can become prohibitively expensive.

“Some enterprises are realizing they need a hybrid approach,” Gualtieri said. “They might use the cloud for certain tasks but invest in on-premises infrastructure for others. It’s about balancing flexibility and cost-efficiency.”

This sentiment was echoed by Ankur Mehrotra, general manager of Amazon SageMaker at AWS. In a recent interview, Mehrotra noted that AWS customers are increasingly looking for solutions that combine the flexibility of the cloud with the control and cost-efficiency of on-premises infrastructure. “What we’re hearing from our customers is that they want purpose-built capabilities for AI at scale,” Mehrotra explains. “Price performance is critical, and you can’t optimize for it with generic solutions.”

To meet these demands, AWS has been enhancing its SageMaker service, which offers managed AI infrastructure and integration with popular open-source tools like Kubernetes and PyTorch. “We want to give customers the best of both worlds,” says Mehrotra. “They get the flexibility and scalability of Kubernetes, but with the performance and resilience of our managed infrastructure.”
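As one concrete illustration of that split between open-source framework and managed infrastructure, here is a hedged sketch of submitting a PyTorch training script through SageMaker’s Python SDK. The script name, role ARN, instance type and version strings are placeholders to adapt, not a recommended configuration.

```python
# Hedged sketch: launching a managed PyTorch training job on SageMaker.
# Assumes: `pip install sagemaker`, AWS credentials, an execution role and
# training data already staged in S3. All names below are placeholders.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",       # your own PyTorch training script
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder ARN
    instance_count=1,
    instance_type="ml.g5.xlarge", # illustrative GPU instance
    framework_version="2.2",      # use a version pairing supported in your region
    py_version="py310",
)

# SageMaker provisions the instance, runs train.py against the S3 data,
# then tears everything down, so you pay for training time rather than
# keeping an always-on cluster.
estimator.fit({"training": "s3://your-bucket/training-data/"})
```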


The role of open source

Open-source tools like PyTorch and TensorFlow have become foundational to AI development, and their role in building custom AI infrastructure cannot be overlooked. Mehrotra underscores the importance of supporting these frameworks while providing the underlying infrastructure needed to scale. “Open-source tools are table stakes,” he says. “But if you just give customers the framework without managing the infrastructure, it leads to a lot of undifferentiated heavy lifting.”

AWS’s strategy is to provide a customizable infrastructure that works seamlessly with open-source frameworks while minimizing the operational burden on customers. “We don’t want our customers spending time on managing infrastructure. We want them focused on building models,” says Mehrotra.

Gualtieri agrees, adding that while open-source frameworks are critical, they must be backed by robust infrastructure. “The open-source community has done amazing things for AI, but at the end of the day, you need hardware that can handle the scale and complexity of modern AI workloads,” he says.

The future of AI infrastructure

As enterprises continue to navigate the AI landscape, the demand for scalable, efficient and custom AI infrastructure will only grow. This is especially true as artificial general intelligence (AGI) — or agentic AI — becomes a reality. “AGI will fundamentally change the game,” Gualtieri said. “It’s not just about training models and making predictions anymore. Agentic AI will control entire processes, and that will require a lot more infrastructure.”


Mehrotra also sees the future of AI infrastructure evolving rapidly. “The pace of innovation in AI is staggering,” he says. “We’re seeing the emergence of industry-specific models, like BloombergGPT for financial services. As these niche models become more common, the need for custom infrastructure will grow.”

AWS, Nvidia and other major players are racing to meet this demand by offering more customizable solutions. But as Gualtieri points out, it’s not just about the technology. “It’s also about partnerships,” he says. “Enterprises can’t do this alone. They need to work closely with vendors to ensure their infrastructure is optimized for their specific needs.”

Custom AI infrastructure is no longer just a cost center — it’s a strategic investment that can provide a significant competitive edge. As enterprises scale their AI ambitions, they must carefully consider their infrastructure choices to ensure they are not only meeting today’s demands but also preparing for the future. Whether through cloud, on-premises or hybrid solutions, the right infrastructure can make all the difference in turning AI from an experiment into a business driver.


