AI-Powered Financial Analysis: Multi-Agent LLM Systems Transform Data into Insights

This article shares my experience building a multi-agent LLM system that uses language models to gather relevant information from the internet, run sentiment analysis on Reddit comments, perform fundamental and technical analysis, and summarize what it finds about a stock. The system is accessible through Streamlit.

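Before getting into the project, I want to go over the technologies and frameworks I used:

1. CrewAI: the multi-agent system.

2. LangChain: tools and model access.

3. Reddit (PRAW) and Yahoo Finance (yfinance): data collection.

4. Hugging Face: a sentiment model fine-tuned for finance.

5. Groq Cloud and OpenAI: access to LLaMA and GPT models.

6. Streamlit: UI and deployment.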

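CrewAI is a collaborative framework designed to let multiple AI agents work together as a team to complete complex tasks efficiently.

Agents, Tasks, and Tools

  • Agent: each agent is an AI unit that takes on a specific role and works according to it, pursuing a specific goal and carrying out specific tasks.
  • Task: a task is a specific activity an agent must complete, together with the instructions for doing it. It defines the expected result and the tools to be used.
  • Tool: tools are the additional resources and integrations an agent can use while carrying out a task, such as data collection, analysis, or access to a particular API.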
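To make the relationship between the three concepts concrete, here is a minimal plain-Python sketch. It is not CrewAI's API: the `Agent` and `Task` dataclasses, `run_sequentially`, and `fake_price_lookup` are hypothetical names invented for illustration, and a real tool would call an API or a model instead of returning a fixed string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """An AI unit with a role, a goal, and the tools it may call."""
    role: str
    goal: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)


@dataclass
class Task:
    """A unit of work: instructions, expected output, responsible agent, tool to use."""
    description: str
    expected_output: str
    agent: Agent
    tool_name: str


def run_sequentially(tasks: List[Task], subject: str) -> List[str]:
    """Run tasks in order; each task calls one tool owned by its agent."""
    results = []
    for task in tasks:
        tool = task.agent.tools[task.tool_name]
        results.append(f"{task.agent.role}: {tool(subject)}")
    return results


# Hypothetical tool: stands in for a real data source.
def fake_price_lookup(symbol: str) -> str:
    return f"price data for {symbol}"


researcher = Agent(
    role="Researcher",
    goal="Collect stock data",
    tools={"price_lookup": fake_price_lookup},
)
task = Task(
    description="Look up recent prices",
    expected_output="A short summary",
    agent=researcher,
    tool_name="price_lookup",
)

outputs = run_sequentially([task], "NVDA")
print(outputs[0])  # Researcher: price data for NVDA
```

In a real framework the agent decides which tool to call through the language model; the fixed `tool_name` here only keeps the sketch deterministic.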

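In this project, I integrated language models from Meta and OpenAI, built custom tools, and developed researcher, technical analyst, fundamental analyst, and reporter agents to gather information about a stock. I will share my experience with this crew.

LangChain is an open-source framework for building applications on top of large language models (LLMs). It provides tools and abstractions for customizing models and improving the accuracy and relevance of what they generate. Developers can use LangChain components to build new prompt chains or customize existing templates, and can give LLMs access to new datasets without retraining.

In this project, I used YahooFinanceNewsTool to fetch stock news, and used LangChain to access the Groq and OpenAI models.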

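PRAW (Python Reddit API Wrapper) is a library for easy access to the Reddit API. It offers a user-friendly interface for fetching Reddit data, browsing subreddits, and creating posts and comments.

yfinance is a Python library for fetching financial data from Yahoo Finance, including stock prices, historical data, and financial statements.

In this project, I used yfinance for fundamental and technical analysis and for visualization, and PRAW to fetch Reddit data and analyze what users on the "wallstreetbets", "stocks", and "investing" subreddits think about a stock.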

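Hugging Face is a technology company and an open-source platform that provides tools and resources for building, training, and deploying machine learning models. The Hugging Face Hub hosts many models for a wide range of machine learning tasks.

In this project, I used the Hugging Face model "distilroberta-finetuned-financial-news-sentiment-analysis" to analyze the sentiment of the Reddit data.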

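Groq is a technology company that develops hardware and software for machine learning and AI. Its products aim to deliver high-performance, low-latency compute for complex AI workloads.

In this project, I integrated the GPT-4o, GPT-4o Mini, LLaMA 3 8B, LLaMA 3.1 8B, and LLaMA 3.1 70B language models from the Groq Cloud and OpenAI platforms.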

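Streamlit is an open-source framework for building interactive web applications, aimed at machine learning, data science, and LLM projects. Streamlit Cloud is a platform for deploying, managing, and sharing applications directly from a GitHub repository.

In my project, I used Streamlit to build an interactive interface and deployed it on Streamlit Cloud.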

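The multi-agent system is covered in three steps:

  1. Financial analysis with custom tools
  2. Defining the crew
  3. Developing and deploying the Streamlit application

To keep the article short, I will not explain all of the code in detail. For more information, feel free to contact me. My contact details are shared at the end of the page.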

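Tools are specialized components that agents use to carry out specific tasks more effectively. They provide capabilities such as data collection, analysis, API interaction, or access to particular resources, and they can also be integrated with LangChain tools.

In this article, I explain the sentiment analysis tool in detail and briefly describe the other tools I used.

  • Browser tool: provides web browsing.
  • Search tool: runs targeted searches using Serper.
  • Sentiment analysis tool: analyzes sentiment in financial text.
  • Fundamental analysis tool: a fundamental stock analysis tool built on Yahoo Finance.
  • Yahoo Finance News Tool: a LangChain tool that collects and analyzes stock-related news.
  • Technical analysis tool: technical stock analysis built on Yahoo Finance.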
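The article does not show the technical analysis tool itself, so the following is only a rough sketch of two indicators such a tool typically computes from closing prices: a simple moving average and the RSI. The function names are hypothetical, the RSI here uses plain averages rather than Wilder's smoothing, and a real tool would get the price series from yfinance instead of a hard-coded list.

```python
from typing import List, Optional


def simple_moving_average(closes: List[float], window: int) -> Optional[float]:
    """Mean of the last `window` closing prices, or None if there is too little data."""
    if window <= 0 or len(closes) < window:
        return None
    return sum(closes[-window:]) / window


def relative_strength_index(closes: List[float], period: int = 14) -> Optional[float]:
    """RSI over the last `period` price changes, using simple averages."""
    if period <= 0 or len(closes) < period + 1:
        return None
    recent = closes[-(period + 1):]
    changes = [later - earlier for earlier, later in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losing days in the window: RSI is conventionally reported as 100.
        return 100.0
    relative_strength = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + relative_strength)


# Hypothetical closing prices, oldest first.
closes = [10.0, 11.0, 12.0, 11.0, 12.0, 13.0]
sma_3 = simple_moving_average(closes, window=3)
rsi_5 = relative_strength_index(closes, period=5)
print(f"3-day SMA: {sma_3:.2f}, 5-day RSI: {rsi_5:.2f}")
```

An RSI below 30 is commonly read as oversold and above 70 as overbought, which is the reading the sample report later in this article applies to its RSI value.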

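First, I load the model and tokenizer from the transformers library for sentiment analysis. The analyze_sentiment function then runs sentiment analysis on a given text and returns "negative", "neutral", or "positive". The get_reddit_posts function retrieves posts that mention a given stock symbol from a given subreddit, using the PRAW library to connect to the Reddit API. The posts are filtered to include only those from the last 30 days. The @tool decorator lets the function be used as a CrewAI tool. The reddit_sentiment_analysis function runs sentiment analysis on posts about the stock symbol from the specified subreddits and returns the count for each sentiment label.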

# sentiment_analysis_tool.py
import os
import praw
import torch
from datetime import datetime, timedelta
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from crewai_tools import tool

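# Download hf model and tokenizer
MODEL_NAME = "mrm8488/distilroberta-finetuned-financial-news-sentiment-analysis"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

def analyze_sentiment(text):
    """
    Analyze the sentiment of a given text.
    Returns "negative", "neutral", or "positive".
    """
    # Truncate long inputs so they fit the model's maximum sequence length.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    scores = outputs.logits.softmax(dim=1).detach().numpy()[0]
    # Label order is assumed to match the model's id2label mapping;
    # check model.config.id2label if you swap in a different model.
    labels = ["negative", "neutral", "positive"]
    label = labels[scores.argmax()]
    return label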

def get_reddit_posts(subreddit_name, stock_symbol, limit=100, days=30):
    """
    Get posts from a specific subreddit containing the stock symbol within the last specified days.
    Note: the search uses time_filter='month', so `days` values above roughly 30 return no extra posts.
    """
    reddit = praw.Reddit(
        client_id=os.getenv("REDDIT_CLIENT_ID"),
        client_secret=os.getenv("REDDIT_CLIENT_SECRET"),
        user_agent=os.getenv("REDDIT_USER_AGENT")
    )
    subreddit = reddit.subreddit(subreddit_name)
    end_date = datetime.utcnow()
    start_date = end_date - timedelta(days=days)  # honor the `days` argument

    posts = []
    for post in subreddit.search(stock_symbol, sort='new', time_filter='month', limit=limit):
        post_date = datetime.utcfromtimestamp(post.created_utc)
        if start_date <= post_date <= end_date:
            posts.append(post.title)  # only the post title is analyzed
    return posts

@tool
def reddit_sentiment_analysis(stock_symbol: str, subreddits: list = ['wallstreetbets', 'stocks', 'investing'], limit: int = 100):
    """
    Perform sentiment analysis on posts from specified subreddits about a stock symbol.
    
    Args:
        stock_symbol (str): The stock symbol to search for.
        subreddits (list): List of subreddits to search in.
        limit (int): Number of posts to fetch from each subreddit.
    
    Returns:
        dict: Number of posts per sentiment label ('negative', 'neutral', 'positive').
    """
    sentiments_counts = {'neutral': 0, 'negative': 0, 'positive': 0}

    for subreddit in subreddits:
        posts = get_reddit_posts(subreddit, stock_symbol, limit)
        for post in posts:
            sentiment = analyze_sentiment(post)
            sentiments_counts[sentiment] += 1

    return sentiments_counts
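The classification and counting steps in the tool above can be illustrated without downloading a model. The sketch below is a simplified stand-in, not the author's code: the logits are made up, the label order is assumed to be the same as in analyze_sentiment, and `label_from_logits` and `sentiment_shares` are hypothetical helper names.

```python
import math
from collections import Counter
from typing import Dict, List

# Assumed label order, mirroring analyze_sentiment above.
LABELS = ["negative", "neutral", "positive"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_from_logits(logits: List[float]) -> str:
    """Pick the label with the highest probability."""
    probs = softmax(logits)
    return LABELS[probs.index(max(probs))]


def sentiment_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Turn per-label counts into fractions of the total number of posts."""
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: n / total for label, n in counts.items()}


# Made-up logits for three post titles.
fake_logits = [[2.0, 0.1, -1.0], [0.0, 3.0, 0.5], [0.2, 2.5, 0.1]]

counts = Counter({label: 0 for label in LABELS})
counts.update(label_from_logits(row) for row in fake_logits)
shares = sentiment_shares(counts)
total_posts = sum(counts.values())
print(dict(counts), total_posts)
```

Deriving the total from the counts themselves keeps the reported total consistent with the per-label numbers, rather than relying on the requested `limit`.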

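For model selection, I wrote a function called initialize_llm that returns either a LLaMA or an OpenAI model, and a create_crew function that sets up the tools to be used. I then defined the agents and tasks. When creating agents, I made sure to give each one a clear goal and a detailed backstory so that it is more capable and focused. For tasks, I provided clear instructions and an expected output so that they are easier to understand and complete. Using the Crew class, I brought all the agents and tasks together, then assembled and kicked off the crew. Finally, I saved the result to a file and returned it to the user.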

import os
from crewai import Agent, Task, Crew, Process
from crewai_tools import SerperDevTool, ScrapeWebsiteTool, WebsiteSearchTool
from tools.sentiment_analysis_tool import reddit_sentiment_analysis
from tools.yf_tech_analysis_tool import yf_tech_analysis
from tools.yf_fundamental_analysis_tool import yf_fundamental_analysis
from langchain_community.tools.yahoo_finance_news import YahooFinanceNewsTool
from langchain_groq import ChatGroq
from langchain_openai import ChatOpenAI
from dotenv import load_dotenv
# Environment Variables
load_dotenv()
# load_dotenv() already copies the .env entries into os.environ, so no
# reassignment is needed. Fail early with a clear message if a key is missing.
for _key in ("SERPER_API_KEY", "REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "REDDIT_USER_AGENT"):
    if not os.getenv(_key):
        raise EnvironmentError(f"Missing required environment variable: {_key}")
# Model Selection
def initialize_llm(model_option, openai_api_key, groq_api_key):
    if model_option == 'OpenAI GPT-4o':
        return ChatOpenAI(openai_api_key=openai_api_key, model='gpt-4o', temperature=0.1)
    elif model_option == 'OpenAI GPT-4o Mini':
        return ChatOpenAI(openai_api_key=openai_api_key, model='gpt-4o-mini', temperature=0.1)
    elif model_option == 'Llama 3 8B':
        return ChatGroq(groq_api_key=groq_api_key, model='llama3-8b-8192', temperature=0.1)
    elif model_option == 'Llama 3.1 70B':
        return ChatGroq(groq_api_key=groq_api_key, model='llama-3.1-70b-versatile', temperature=0.1)
    elif model_option == 'Llama 3.1 8B':
        return ChatGroq(groq_api_key=groq_api_key, model='llama-3.1-8b-instant', temperature=0.1)
    else:
        raise ValueError("Invalid model option selected")


def create_crew(stock_symbol, model_option, openai_api_key, groq_api_key):
    llm = initialize_llm(model_option, openai_api_key, groq_api_key)
    # Tools Initialization
    reddit_tool = reddit_sentiment_analysis
    serper_tool = SerperDevTool()
    yf_tech_tool = yf_tech_analysis
    yf_fundamental_tool = yf_fundamental_analysis
    # Agents Definitions
    researcher = Agent(
        role='Senior Stock Market Researcher',
        goal='Gather and analyze comprehensive data about {stock_symbol}',
        verbose=True,
        memory=True,
        backstory="With a Ph.D. in Financial Economics and 15 years of experience in equity research, you're known for your meticulous data collection and insightful analysis.",
        tools=[reddit_tool, serper_tool, YahooFinanceNewsTool()],
        llm=llm
    )
    technical_analyst = Agent(
        role='Expert Technical Analyst',
        goal='Perform an in-depth technical analysis on {stock_symbol}',
        verbose=True,
        memory=True,
        backstory="As a Chartered Market Technician (CMT) with 15 years of experience, you have a keen eye for chart patterns and market trends.",
        tools=[yf_tech_tool],
        llm=llm
    )
    fundamental_analyst = Agent(
        role='Senior Fundamental Analyst',
        goal='Conduct a comprehensive fundamental analysis of {stock_symbol}',
        verbose=True,
        memory=True,
        backstory="With a CFA charter and 15 years of experience in value investing, you dissect financial statements and identify key value drivers.",
        tools=[yf_fundamental_tool],
        llm=llm
    )
    reporter = Agent(
        role='Chief Investment Strategist',
        goal='Synthesize all analyses to create a definitive investment report on {stock_symbol}',
        verbose=True,
        memory=True,
        backstory="As a seasoned investment strategist with 20 years of experience, you weave complex financial data into compelling investment narratives.",
        tools=[reddit_tool, serper_tool, yf_fundamental_tool, yf_tech_tool, YahooFinanceNewsTool()],
        llm=llm
    )
    # Task Definitions
    research_task = Task(
        description=(
            "Conduct research on {stock_symbol}. Your analysis should include:\n"
            "1. Current stock price and historical performance (5 years).\n"
            "2. Key financial metrics (P/E, EPS growth, revenue growth, margins).\n"
            "3. Recent news and press releases (1 month).\n"
            "4. Analyst ratings and price targets (min 3 analysts).\n"
            "5. Reddit sentiment analysis (100 posts).\n"
            "6. Major institutional holders and recent changes.\n"
            "7. Competitive landscape and {stock_symbol}'s market share.\n"
            "Use reputable financial websites for data."
        ),
        expected_output='A detailed 150-word research report with data sources and brief analysis.',
        agent=researcher
    )
    technical_analysis_task = Task(
        description=(
            "Perform technical analysis on {stock_symbol}. Include:\n"
            "1. 50-day and 200-day moving averages (1 year).\n"
            "2. Key support and resistance levels (3 each).\n"
            "3. RSI and MACD indicators.\n"
            "4. Volume analysis (3 months).\n"
            "5. Significant chart patterns (6 months).\n"
            "6. Fibonacci retracement levels.\n"
            "7. Comparison with sector's average.\n"
            "Use the yf_tech_analysis tool for data."
        ),
        expected_output='A 100-word technical analysis report with buy/sell/hold signals and annotated charts.',
        agent=technical_analyst
    )
    fundamental_analysis_task = Task(
        description=(
            "Conduct fundamental analysis of {stock_symbol}. Include:\n"
            "1. Review last 3 years of financial statements.\n"
            "2. Key ratios (P/E, P/B, P/S, PEG, Debt-to-Equity, etc.).\n"
            "3. Comparison with main competitors and industry averages.\n"
            "4. Revenue and earnings growth trends.\n"
            "5. Management effectiveness (ROE, capital allocation).\n"
            "6. Competitive advantages and market position.\n"
            "7. Growth catalysts and risks (2-3 years).\n"
            "8. DCF valuation model with assumptions.\n"
            "Use yf_fundamental_analysis tool for data."
        ),
        expected_output='A 100-word fundamental analysis report with buy/hold/sell recommendation and key metrics summary.',
        agent=fundamental_analyst
    )
    report_task = Task(
        description=(
            "Create an investment report on {stock_symbol}. Include:\n"
            "1. Executive Summary: Investment recommendation.\n"
            "2. Company Snapshot: Key facts.\n"
            "3. Financial Highlights: Top metrics and peer comparison.\n"
            "4. Technical Analysis: Key findings.\n"
            "5. Fundamental Analysis: Top strengths and concerns.\n"
            "6. Risk and Opportunity: Major risk and growth catalyst.\n"
            "7. Reddit Sentiment: Key takeaway from sentiment analysis, including the number of positive, negative and neutral comments and total comments.\n"
            "8. Investment Thesis: Bull and bear cases.\n"
            "9. Price Target: 12-month forecast.\n"
        ),
        expected_output='A 600-word investment report with clear sections, key insights.',
        agent=reporter
    )
    # Crew Definition and Kickoff for Result
    crew = Crew(
        agents=[researcher, technical_analyst, fundamental_analyst, reporter],
        tasks=[research_task, technical_analysis_task, fundamental_analysis_task, report_task],
        process=Process.sequential,
        cache=True
    )
    result = crew.kickoff(inputs={'stock_symbol': stock_symbol})
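    os.makedirs('./crew_results', exist_ok=True)
    file_path = f"./crew_results/crew_result_{stock_symbol}.md"
    result_str = str(result)
    # Write as UTF-8 so non-ASCII characters in the report do not fail on
    # platforms whose default encoding is not UTF-8.
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(result_str)

    return file_path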
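Two details of the listing above may be easier to see in isolation: the model option strings map to a provider and a model ID, and the `{stock_symbol}` placeholders in goals and task descriptions are filled from the `inputs` passed to `kickoff`. The sketch below mimics both in plain Python. `MODEL_REGISTRY`, `resolve_model`, and `fill_placeholders` are hypothetical names, the table is just an alternative layout for the if/elif chain, and `str.format` is only an approximation of how the framework substitutes values.

```python
from typing import Dict, Tuple

# Option label -> (provider, model ID), copied from initialize_llm above.
MODEL_REGISTRY: Dict[str, Tuple[str, str]] = {
    "OpenAI GPT-4o": ("openai", "gpt-4o"),
    "OpenAI GPT-4o Mini": ("openai", "gpt-4o-mini"),
    "Llama 3 8B": ("groq", "llama3-8b-8192"),
    "Llama 3.1 70B": ("groq", "llama-3.1-70b-versatile"),
    "Llama 3.1 8B": ("groq", "llama-3.1-8b-instant"),
}


def resolve_model(model_option: str) -> Tuple[str, str]:
    """Look up the provider and model ID for a UI option label."""
    try:
        return MODEL_REGISTRY[model_option]
    except KeyError:
        raise ValueError("Invalid model option selected") from None


def fill_placeholders(template: str, inputs: Dict[str, str]) -> str:
    """Substitute {name} placeholders with the matching input values."""
    return template.format(**inputs)


provider, model_id = resolve_model("Llama 3.1 8B")
goal = fill_placeholders(
    "Perform an in-depth technical analysis on {stock_symbol}",
    {"stock_symbol": "NVDA"},
)
print(provider, model_id)  # groq llama-3.1-8b-instant
print(goal)  # Perform an in-depth technical analysis on NVDA
```

A lookup table keeps the list of supported models in one place, so adding a model means adding one entry instead of another branch.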

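I built an interactive page that lets users choose a model, enter the required API keys, and generate charts from Yahoo Finance data. The multi-agent system runs all the steps and displays the output on screen. I then pushed the code to GitHub and connected it to Streamlit Cloud. To keep this article concise, I only summarize that part here. For a more detailed code walkthrough, please contact me privately.

First, choose the model you want to use.

To run the project, you need an API key for the model you choose. I have shared links explaining how to get access to the Groq and OpenAI APIs.

Enter the ticker symbol of the stock you want analyzed in detail, for example NVDA, AAPL, or ACN.

For the interactive chart, choose the time period and indicators, then click the "Analyze Stock" button.

The system then collects data about the selected stock from the internet and, using the selected model, produces an analysis that combines several inputs, including the fundamental and technical analysis output and the sentiment results.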

Output

Stock Information

Company Name: NVIDIA Corporation
Sector: Technology
Industry: Semiconductors
Country: United States
Current Price: $103.73
Market Cap: $2551581769728
Analysis Result
NVIDIA Corporation (NVDA) Investment Report
1. Executive Summary: Investment Recommendation NVIDIA Corporation (NVDA) is a leading player in the semiconductor industry, particularly known for its graphics processing units (GPUs) that power gaming, AI, and data center applications. Given its robust financial health, strong growth metrics, and competitive positioning, the recommendation is to Hold. While the stock appears oversold, it shows potential for recovery amidst current bearish trends.
2. Company Snapshot: Key Facts
Company Name: NVIDIA Corporation
Sector: Technology
Industry: Semiconductors
Market Cap: $2.55 trillion
P/E Ratio: 60.66
PEG Ratio: 0.89
Debt to Equity Ratio: 22.87
3. Financial Highlights: Top Metrics and Peer Comparison NVIDIA's financial performance is characterized by:
Revenue Growth (YoY): 25.85%
Net Income Growth (YoY): 58.13%
Gross Margin: 75.29%
Operating Margin: 64.93%
Net Profit Margin: 53.40% Compared to peers, NVDA's high P/E ratio indicates strong growth expectations, while its low PEG ratio suggests it may be undervalued relative to its growth potential.
4. Technical Analysis: Key Findings
Current Price: $103.73
20-Day Moving Average: $122.04
50-Day Moving Average: $119.38
RSI: 21.04 (indicating oversold conditions)
MACD: Bearish
Support Level: $102.54
Resistance Level: $136.15 The technical indicators suggest a bearish breakdown, but the stock is currently oversold, indicating a potential for a price rebound.
5. Fundamental Analysis: Top Strengths and Concerns Strengths:
Strong revenue and net income growth rates.
Low debt-to-equity ratio, indicating financial stability.
High gross and operating margins, reflecting efficient operations.
Concerns:
High P/E ratio may indicate overvaluation.
Current bearish trends in the market could impact short-term performance.
6. Risk and Opportunity: Major Risk and Growth Catalyst Major Risk: The semiconductor industry is highly cyclical and sensitive to economic downturns, which could adversely affect NVDA's sales and profitability.
Growth Catalyst: The increasing demand for AI and machine learning applications presents a significant growth opportunity for NVIDIA, as its GPUs are essential for these technologies.
7. Reddit Sentiment: Key Takeaway from Sentiment Analysis From the sentiment analysis conducted on Reddit:
Total Comments Analyzed: 100
Positive Comments: 3
Negative Comments: 6
Neutral Comments: 24 The sentiment is predominantly neutral, with a slight inclination towards negativity, reflecting cautious investor sentiment amidst current market conditions.
8. Investment Thesis: Bull and Bear Cases Bull Case: If NVIDIA continues to capitalize on the growing AI market and maintains its competitive edge, it could see substantial revenue growth, leading to a higher stock price.
Bear Case: Economic headwinds and increased competition in the semiconductor space could lead to declining sales and profitability, negatively impacting the stock price.
9. Price Target: 12-Month Forecast Considering the current market conditions and NVDA's growth potential, a conservative 12-month price target is set at $130, reflecting a recovery from current oversold levels and aligning with historical performance trends.

In conclusion, while NVIDIA Corporation presents a compelling investment opportunity due to its strong fundamentals and growth potential, investors should remain cautious given the current market sentiment and technical indicators. A hold position is recommended until clearer bullish signals emerge.

This is not investment advice; it is only a hobby project.

It is important to keep agent and task descriptions short and clear. Long descriptions can exceed the tokens-per-minute (TPM) limit and drive up costs.
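One way to get a feel for this before running a crew is a back-of-the-envelope token estimate. The sketch below assumes roughly four characters per token for English text, which is only a rule of thumb and not a real tokenizer. The function names and the budget values are hypothetical, and actual usage will be higher because the framework adds its own prompt text and tool output.

```python
from typing import Iterable

# Rough rule of thumb for English text; an assumption, not a tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count, rounded up."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def fits_budget(prompts: Iterable[str], tokens_per_minute: int) -> bool:
    """Check whether the combined prompts stay within a per-minute token budget."""
    return sum(estimate_tokens(p) for p in prompts) <= tokens_per_minute


descriptions = [
    "Conduct research on NVDA. Use reputable financial websites for data.",
    "Perform technical analysis on NVDA. Use the yf_tech_analysis tool for data.",
]
total_estimate = sum(estimate_tokens(d) for d in descriptions)
print(total_estimate, fits_budget(descriptions, tokens_per_minute=6000))
```

For precise numbers, count tokens with the tokenizer that matches the chosen model instead of this heuristic.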

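Before I built the fundamental and technical analysis tools, the agents completed their tasks by fetching all of the data from the internet. That created a heavy processing load, so I developed custom tools to cover this need. This approach is more cost-effective, which is why assessing your requirements correctly matters.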
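As an illustration of the kind of calculation a custom fundamental analysis tool can do locally instead of having an agent search the web, here is a small sketch of three ratios that appear in the sample report. It is not the author's tool: the function names and input numbers are hypothetical, and a real tool would read the inputs from yfinance.

```python
from typing import Optional


def price_to_earnings(price: float, earnings_per_share: float) -> Optional[float]:
    """P/E ratio; undefined when earnings are zero or negative."""
    if earnings_per_share <= 0:
        return None
    return price / earnings_per_share


def peg_ratio(pe: Optional[float], growth_rate_pct: float) -> Optional[float]:
    """PEG = P/E divided by the expected annual earnings growth in percent."""
    if pe is None or growth_rate_pct <= 0:
        return None
    return pe / growth_rate_pct


def debt_to_equity(total_debt: float, shareholder_equity: float) -> Optional[float]:
    """Debt-to-equity as a plain ratio (some data sources report it as a percentage)."""
    if shareholder_equity <= 0:
        return None
    return total_debt / shareholder_equity


# Hypothetical inputs for illustration only.
pe = price_to_earnings(price=100.0, earnings_per_share=5.0)
peg = peg_ratio(pe, growth_rate_pct=25.0)
dte = debt_to_equity(total_debt=50.0, shareholder_equity=200.0)
print(pe, peg, dte)
```

Returning None for undefined ratios lets the calling agent report "not available" instead of a misleading number.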

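In this project, the GPT models produced better inferences than the LLaMA models. GPT-4o Mini is capable, has a higher TPM limit, and performs comparably to GPT-4, which makes it the more cost-effective choice.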

Thanks for reading!

Original author: Batuhan Sener
Translator: 过儿
Art editor: 过儿
Proofreader: Jason
Original article: https://medium.com/@batuhansenerr/ai-powered-financial-analysis-multi-agent-systems-transform-data-into-insights-d94e4867d75d