
Lessons Learned After a Year of Building with Large Language Models (LLMs)


Over the past year, Large Language Models (LLMs) have reached impressive competence for real-world applications. Their performance continues to improve and costs are decreasing, with a projected $200 billion investment in artificial intelligence by 2025. Access through provider APIs has democratised these technologies, enabling ML engineers, scientists, and anyone else to integrate intelligence into their products. Despite the lowered entry barriers, however, building effective products with LLMs remains a significant challenge. This is a summary of the original paper of the same name at https://applied-llms.org/; please refer to that document for detailed information.

Fundamental Aspects of Working with LLMs


· Prompting Techniques


Prompting is one of the most critical techniques when working with LLMs, and it is essential for prototyping new applications. Although often underestimated, correct prompt engineering can be highly effective.

– Fundamental Techniques: Use methods such as n-shot prompting, in-context learning, and chain-of-thought to enhance response quality. N-shot examples should be representative and varied, and chain-of-thought instructions should be clear and explicit to reduce hallucinations and improve user confidence.

– Structuring Inputs and Outputs: Structured inputs and outputs facilitate integration with subsequent systems and enhance clarity. Serialisation formats and structured schemas help the model better understand the information.


– Simplicity in Prompts: Prompts should be clear and concise. Breaking down complex prompts into more straightforward steps can aid in iteration and evaluation.

– Token Context: It’s crucial to optimise the amount of context sent to the model, removing redundant information and improving structure for clearer understanding.
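The techniques above can be combined in a single prompt template. The sketch below assembles an n-shot, chain-of-thought prompt as plain text; the example questions, reasoning strings, and wording are hypothetical illustrations, not taken from the paper.

```python
def build_prompt(examples, question):
    """Assemble an n-shot, chain-of-thought prompt as plain text."""
    parts = ["Answer the question. Think step by step before giving the final answer.\n"]
    for ex in examples:
        # Each example shows the question, an explicit reasoning step, and the answer.
        parts.append(f"Q: {ex['q']}\nReasoning: {ex['r']}\nA: {ex['a']}\n")
    # The new question ends with "Reasoning:" to elicit step-by-step output first.
    parts.append(f"Q: {question}\nReasoning:")
    return "\n".join(parts)

# Two representative, varied examples (the "n" in n-shot).
examples = [
    {"q": "What is 2 + 2?", "r": "Add the two numbers.", "a": "4"},
    {"q": "What is 3 * 5?", "r": "Multiply the two numbers.", "a": "15"},
]
prompt = build_prompt(examples, "What is 10 - 4?")
```

The resulting string would be sent as the user message of a model request; keeping the template a plain function makes it easy to iterate on each part separately.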


· Retrieval-Augmented Generation (RAG)


RAG is a technique that enhances LLM performance by supplying additional context retrieved from relevant documents.


– Quality of Retrieved Documents: The relevance and detail of the retrieved documents impact output quality. Use metrics such as Mean Reciprocal Rank (MRR) and Normalised Discounted Cumulative Gain (NDCG) to assess quality.


– Use of Keyword Search: Although vector embeddings are useful, keyword search remains relevant for specific queries and is more interpretable.

– Advantages of RAG over Fine-Tuning: RAG is more cost-effective and easier to maintain than fine-tuning, offering more precise control over retrieved documents and avoiding information overload.
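The retrieve-then-generate flow can be sketched in a few lines. This toy example uses keyword-overlap scoring as a stand-in for a real search index, and the corpus, scoring function, and prompt wording are illustrative assumptions, not a production retriever.

```python
def keyword_score(query, doc):
    """Count shared lowercase terms between query and document (toy scoring)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query, corpus, k=2):
    """Return the top-k documents ranked by keyword overlap."""
    ranked = sorted(corpus, key=lambda d: keyword_score(query, d), reverse=True)
    return ranked[:k]

def build_rag_prompt(query, corpus):
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(f"- {d}" for d in retrieve(query, corpus))
    return f"Use only the context below to answer.\nContext:\n{context}\n\nQuestion: {query}"

corpus = [
    "Caching stores and reuses model responses.",
    "Fine-tuning adapts a model with extra training data.",
    "Keyword search matches query terms against documents.",
]
prompt = build_rag_prompt("How does keyword search work?", corpus)
```

In practice the toy scorer would be replaced by a keyword index (e.g. BM25) or vector search, but the prompt-assembly step stays the same.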


Optimising and Tuning Workflows


Optimising workflows with LLMs involves refining and adapting strategies to ensure efficiency and effectiveness. Here are some key strategies:


· Step-by-Step, Multi-Turn Flows


Decomposing complex tasks into manageable steps often yields better results, allowing for more controlled and iterative refinement.


– Best Practices: Ensure each step has a defined goal, use structured outputs to facilitate integration, incorporate a planning phase with predefined options, and validate plans. Experimenting with task architectures, such as linear chains or Directed Acyclic Graphs (DAGs), can optimise performance.
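As a sketch of the DAG idea, the snippet below wires three hypothetical steps (the step functions and their outputs are invented for illustration) and executes them in dependency order, passing structured outputs downstream.

```python
def extract(state):
    """Step 1: pull entities out of the input text."""
    return {"entities": state["text"].split()}

def plan(state):
    """Step 2: turn entities into planned sub-steps."""
    return {"steps": [f"handle {e}" for e in state["entities"]]}

def summarise(state):
    """Step 3: summarise the plan as a structured output."""
    return {"summary": f"{len(state['steps'])} steps planned"}

# DAG: each node lists the nodes it depends on, plus its step function.
dag = {
    "extract": ([], extract),
    "plan": (["extract"], plan),
    "summarise": (["plan"], summarise),
}

def run_dag(dag, initial):
    """Execute nodes whose dependencies are done, merging outputs into shared state."""
    done, state = set(), dict(initial)
    while len(done) < len(dag):
        for name, (deps, fn) in dag.items():
            if name not in done and all(d in done for d in deps):
                state.update(fn(state))
                done.add(name)
    return state

out = run_dag(dag, {"text": "alpha beta"})
```

In a real system each step function would wrap an LLM call with its own prompt and validated structured output; the DAG shape makes it easy to swap a linear chain for parallel branches.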


· Prioritising Deterministic Workflows


Ensuring predictable outcomes is crucial for reliability. Use deterministic plans to achieve more consistent results.

– Benefits: Deterministic plans produce controlled and reproducible results, make tracing and fixing specific failures easier, and, expressed as DAGs, adapt better to new situations than static prompts.

– Approach: Start with general objectives and develop a plan. Execute the plan in a structured manner and use the generated plans for few-shot learning or fine-tuning.
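A minimal sketch of this approach, assuming hypothetical plan names and step handlers: the objective selects one predefined plan, and execution follows it step by step, so the same objective always yields the same trace.

```python
# Predefined plans: each objective maps deterministically to a fixed step list.
PLANS = {
    "summarise_document": ["fetch", "chunk", "summarise_chunks", "merge"],
    "answer_question": ["fetch", "retrieve", "answer"],
}

def execute(objective, handlers):
    """Run the plan for an objective; same objective, same steps, same order."""
    trace = []
    for step in PLANS[objective]:
        trace.append(handlers[step](step))
    return trace

# Stub handlers standing in for real step implementations (e.g. LLM calls).
handlers = {name: (lambda s: f"done:{s}") for name in
            ["fetch", "chunk", "summarise_chunks", "merge", "retrieve", "answer"]}
trace = execute("answer_question", handlers)
```

Because the plan is fixed up front, each run is reproducible and a failed step can be pinpointed from the trace; the recorded plans can later seed few-shot examples or fine-tuning data.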



· Enhancing Output Diversity Beyond Temperature


Increasing temperature can introduce diversity, but it does not by itself guarantee a good distribution of outputs. Use additional strategies to improve variety.

– Strategies: Modify prompt elements such as item order, maintain a list of recent outputs to avoid repetitions, and use different phrasings to influence output diversity.
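Two of these strategies can be sketched directly: shuffling the order of items included in the prompt, and keeping a short window of recent outputs to filter repeats. The class name and window size below are illustrative choices.

```python
import random

def diversify_items(items, seed=None):
    """Return the items in a shuffled order before inserting them into the prompt."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    return shuffled

class RecentFilter:
    """Reject outputs that appeared within the last `window` accepted outputs."""
    def __init__(self, window=3):
        self.window = window
        self.recent = []

    def accept(self, output):
        if output in self.recent:
            return False          # repeat: ask the model again or resample
        self.recent.append(output)
        self.recent = self.recent[-self.window:]  # keep only the recent window
        return True

f = RecentFilter(window=2)
```

Both tricks change what the model sees or what the application accepts, so they improve variety even at a fixed temperature.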


· The Underappreciated Value of Caching


Caching is a powerful technique for reducing costs and latency by storing and reusing responses.

– Approach: Use unique identifiers for cacheable items and employ caching techniques similar to search engines.


– Benefits: Reduces costs by avoiding recalculation of responses and serves vetted responses to reduce risks.
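A minimal sketch of response caching, keyed by a hash of the model name and prompt. Here `call_model` is a hypothetical stub standing in for a real LLM request; the point is that repeated identical requests hit the cache instead of the model.

```python
import hashlib

cache = {}
calls = {"count": 0}

def call_model(prompt):
    """Stub for a real LLM request; counts invocations so caching is observable."""
    calls["count"] += 1
    return f"response to: {prompt}"

def cached_call(model, prompt):
    """Serve a cached response when the (model, prompt) pair was seen before."""
    key = hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()
    if key not in cache:
        cache[key] = call_model(prompt)
    return cache[key]

a = cached_call("m1", "hello")
b = cached_call("m1", "hello")   # served from the cache; no second model call
```

The same keying idea extends to serving pre-vetted responses for known-sensitive queries.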


· When to Fine-Tune


Fine-tuning may be necessary when prompts alone do not achieve the desired performance. Evaluate the costs and benefits of this technique.

– Examples: Honeycomb improved performance in specific language queries through fine-tuning. Rechat achieved consistent formatting by fine-tuning the model for structured data.

– Considerations: Assess if the cost of fine-tuning justifies the improvement and use synthetic or open-source data to reduce annotation costs.



Evaluation and Monitoring


Effective evaluation and monitoring are crucial to ensuring LLM performance and reliability.

· Assertion-Based Unit Tests


Create unit tests with real input/output examples to verify the model’s accuracy according to specific criteria.

– Approach: Define assertions to validate outputs and verify that the generated code performs as expected.
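The assertion style can be sketched as a plain unit test. Here `fake_model` is a stub standing in for a real LLM call, returning a fixed structured answer so the example runs offline; in practice the assertions would run against real input/output examples.

```python
import json

def fake_model(prompt):
    # Hypothetical stub for an LLM call that must return structured JSON.
    return '{"city": "Paris", "country": "France"}'

def test_output_is_valid_json_with_expected_keys():
    # Assertion-based check: the output parses and matches the expected schema.
    out = json.loads(fake_model("Where is the Eiffel Tower?"))
    assert set(out) == {"city", "country"}
    assert out["country"] == "France"

test_output_is_valid_json_with_expected_keys()
```

Checks like these run in an ordinary test runner, so regressions in prompts or models surface the same way as code bugs.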


· LLM-as-Judge

Use an LLM to evaluate the outputs of another LLM. Although imperfect, it can provide valuable insights, especially in pairwise comparisons.


– Best Practices: Compare two outputs to determine which is better, mitigate biases by alternating the order of options and allowing ties, and have the LLM explain its decision to improve evaluation reliability.
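The pairwise comparison with order-swapping can be sketched as follows. The judge here is a deterministic stub (it simply prefers the longer answer, an invented heuristic) standing in for a real LLM judge; the mitigation logic is the part being illustrated.

```python
def stub_judge(option_a, option_b):
    """Stand-in for an LLM judge; returns 'A', 'B', or 'tie'."""
    if len(option_a) > len(option_b):
        return "A"
    if len(option_b) > len(option_a):
        return "B"
    return "tie"

def pairwise_judgement(x, y, judge=stub_judge):
    """Ask the judge twice with the options swapped to mitigate position bias."""
    first = judge(x, y)
    second = judge(y, x)             # swapped order
    if first == "A" and second == "B":
        return "x"                   # consistent preference for x
    if first == "B" and second == "A":
        return "y"                   # consistent preference for y
    return "tie"                     # disagreement or explicit tie

verdict = pairwise_judgement("short", "a much longer answer")
```

With a real LLM judge, the prompt would also ask for an explanation before the verdict, which tends to make the judgement more reliable.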


· The “Intern Test”

Evaluate whether an average university student could complete the task given the input and context provided to the LLM.

– Approach: If the LLM lacks the necessary knowledge, enrich the context or simplify the task. Decompose complex tasks into simpler components and investigate failure patterns to understand model shortcomings.


· Avoiding Overemphasis on Certain Evaluations

Do not focus excessively on specific evaluations that might distort overall performance metrics.


– Example: A needle-in-a-haystack evaluation can help measure recall but does not fully capture real-world performance. Consider practical assessments that reflect real use cases.


Key Takeaways


The lessons learned from building with LLMs underscore the importance of proper prompting techniques, information retrieval strategies, workflow optimisation, and practical evaluation and monitoring methodologies. Applying these principles can significantly enhance your LLM-based applications’ effectiveness, reliability, and efficiency. Stay updated with advancements in LLM technology, continuously refine your approach, and foster a culture of ongoing learning to ensure successful integration and an optimised user experience.




Feds Crypto Trace Gets Incognito Market Creator 30 Years



The creator of Incognito Market, the online black market that used crypto as its economic heart, has been sentenced to 30 years in prison after some blockchain sleuthing led US authorities straight to the platform’s steward.

The Justice Department said on Wednesday that a Manhattan court gave Rui-Siang Lin three decades behind bars for owning and operating Incognito, which sold $105 million worth of illicit narcotics between its launch in October 2020 and its closure in March 2024.

Lin, who pleaded guilty to his role in December 2024, was sentenced for conspiring to distribute narcotics, money laundering, and conspiring to sell misbranded medication.

Incognito allowed users to buy and sell drugs using Bitcoin (BTC) and Monero (XMR) while taking a 5% cut, and Lin’s undoing ultimately came after the FBI traced the platform’s crypto to an account in Lin’s name at a crypto exchange.


“Today’s sentence puts traffickers on notice: you cannot hide in the shadows of the Internet,” said Manhattan US Attorney Jay Clayton. “Our larger message is simple: the internet, ‘decentralization,’ ‘blockchain’ — any technology — is not a license to operate a narcotics distribution business.”

Source: US Attorney SDNY

In addition to prison time, Lin was sentenced to five years of supervised release and ordered to pay more than $105 million in forfeiture.

Crypto tracing led FBI right to Lin

In March 2024, the Justice Department said Lin closed Incognito and stole at least $1 million that its users had deposited in their accounts on the platform.

Lin, known online as “Pharoah,” then attempted to blackmail Incognito’s users, demanding that buyers and vendors pay him or he would publicly share their user history and crypto addresses.

Lin wrote “YES, THIS IS AN EXTORTION!!!” in a post to Incognito’s website. Source: Department of Justice

Months later, in May 2024, authorities arrested Lin, a Taiwanese national, at New York’s John F. Kennedy Airport after the FBI tied him to Incognito partly by tracing the platform’s crypto transfers to a crypto exchange account in Lin’s name.

The FBI said a crypto wallet that Lin controlled received funds from a known wallet of Incognito’s, and those funds were then sent to Lin’s exchange account.



The agency said it traced at least four transfers showing Lin’s crypto wallet sent Bitcoin originally from Incognito to a “swapping service” to exchange it for XMR, which was then deposited to the exchange account.

The exchange gave the FBI a photo of Lin’s Taiwanese driver’s license used to open the account, along with an email address and phone number, and the agency tied the email and number to an account at the web domain registrar Namecheap.