Spaces: MtotoWaJemo / nifty-news-analysis


mtotowajemo0 committed 24 days ago

Commit f045566

1 Parent(s): 30ab83e

Moved Streamlit app files to root directory

Files changed (6)

.gitignore +16 -0
LICENSE +21 -0
README.md +57 -13
app.py +293 -0
nifty-news-analysis +0 -1
requirements.txt +6 -0

.gitignore ADDED

	@@ -0,0 +1,16 @@

+__pycache__/
+*.pyc
+*.pyo
+*.pyd
+.Python
+env/
+venv/
+.env
+*.log
+*.cache
+*.DS_Store
+init_and_push.txt
+init_and_push.ps1
+*.tmp

LICENSE ADDED

	@@ -0,0 +1,21 @@

+MIT License
+Copyright (c) 2025 mtotowajemo0
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

README.md CHANGED

@@ -1,13 +1,57 @@
----
-title: Nifty News Analysis
-emoji: 📈
-colorFrom: red
-colorTo: yellow
-sdk: gradio
-sdk_version: 5.32.1
-app_file: app.py
-pinned: false
-license: mit
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# NIFTY 50 News Analysis
+## Overview
+This project is a Streamlit-based web application that analyzes news sentiment for companies in the NIFTY 50 index, categorized by sectors. It fetches news articles using the NewsAPI, summarizes them using the T5 model, and performs sentiment analysis using DistilBERT. The app provides insights into sector and company sentiment to guide investment decisions.
+## Features
+- **Sector Selection**: Choose from NIFTY 50 sectors (e.g., Financials, Healthcare).
+- **Time Frame Analysis**: Analyze news from different periods (1D, 5D, 1M, 6M, YTD, 1Y, 5Y).
+- **Sentiment Analysis**: Summarizes news and classifies sentiment as Positive, Negative, or Neutral.
+- **Investment Insights**: Provides sentiment scores and recommendations for companies.
+- **Interactive UI**: Built with Streamlit, featuring a user-friendly interface with tables and visualizations.
+## Installation
+1. Clone the repository:
+   ```bash
+   git clone https://github.com/mtotowajemo0/nifty-news-analysis.git
+   ```
+2. Navigate to the project directory:
+   ```bash
+   cd nifty-news-analysis
+   ```
+3. Install dependencies:
+   ```bash
+   pip install -r requirements.txt
+   ```
+## Requirements
+- Python 3.8+
+- Libraries listed in `requirements.txt`:
+  - streamlit
+  - newsapi-python
+  - transformers
+  - streamlit-extras
+  - pandas
+  - torch
+## Usage
+1. Obtain a NewsAPI key from [newsapi.org](https://newsapi.org/).
+2. Set the `NEWSAPI_KEY` environment variable to your NewsAPI key (`app.py` reads it with `os.getenv`).
+3. Run the Streamlit app:
+   ```bash
+   streamlit run app.py
+   ```
+4. Open the app in your browser, select a sector and time frame, and click "Analyze News" to view results.
+## Files
+- `app.py`: Main application script with Streamlit code, news fetching, and sentiment analysis.
+- `requirements.txt`: List of Python dependencies.
+- `README.md`: Project documentation (this file).
+## Notes
+- The app uses the `t5-small` model for summarization and `distilbert-base-uncased-finetuned-sst-2-english` for sentiment analysis.
+- News articles are filtered based on a weighted keyword system to ensure relevance.
+- Sentiment scores are calculated as (Positive - Negative) / Total Articles.
+- **Disclaimer**: Insights are for informational purposes only and not financial advice.
+## License
+MIT License. See [LICENSE](LICENSE) for details.
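
Review note: the scoring rule stated in the README notes, together with the 0.7 confidence cutoff and the ±0.3 recommendation thresholds that `app.py` in this commit applies, can be sketched as below. This is a minimal, dependency-free sketch rather than the app's actual code. The function names `display_sentiment`, `tally`, `sentiment_score`, and `recommendation` are hypothetical, and the `(label, confidence)` pairs are made-up stand-ins for classifier output.

```python
from typing import Dict, Iterable, Tuple


def display_sentiment(label: str, confidence: float, threshold: float = 0.7) -> str:
    """Map a binary classifier label to Positive/Negative/Neutral.

    Predictions at or below the threshold are treated as Neutral,
    since the SST-2 classifier itself only emits POSITIVE or NEGATIVE.
    """
    if confidence > threshold:
        if label == "POSITIVE":
            return "Positive"
        if label == "NEGATIVE":
            return "Negative"
    return "Neutral"


def tally(results: Iterable[Tuple[str, float]]) -> Dict[str, int]:
    """Count displayed sentiments over (label, confidence) pairs."""
    counts = {"Positive": 0, "Negative": 0, "Neutral": 0}
    for label, confidence in results:
        counts[display_sentiment(label, confidence)] += 1
    return counts


def sentiment_score(counts: Dict[str, int]) -> float:
    """(Positive - Negative) / Total Articles, or 0.0 when there are no articles."""
    total = sum(counts.values())
    return (counts["Positive"] - counts["Negative"]) / total if total > 0 else 0.0


def recommendation(score: float) -> str:
    """Apply the +/-0.3 thresholds to a sentiment score."""
    if score > 0.3:
        return "Consider buying"
    if score < -0.3:
        return "Avoid"
    return "Monitor"


# Illustrative input only; these are not real classifier results.
sample = [
    ("POSITIVE", 0.95),
    ("POSITIVE", 0.90),
    ("NEGATIVE", 0.85),
    ("POSITIVE", 0.60),  # low confidence, counted as Neutral
    ("POSITIVE", 0.99),
]
counts = tally(sample)
score = sentiment_score(counts)
print(counts, round(score, 2), recommendation(score))
```

Because Neutral articles still count toward the total, a company with many low-confidence predictions drifts toward a score of 0 and a "Monitor" recommendation.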

app.py ADDED

	@@ -0,0 +1,293 @@

+import streamlit as st
+from newsapi import NewsApiClient
+from transformers import pipeline
+from streamlit_extras.colored_header import colored_header
+from datetime import datetime, timedelta
+import pandas as pd
+import uuid
+import os
+# NIFTY 50 companies with tickers and sectors
+nifty_50_data = {
+    "Adani Enterprises": {"ticker": "ADANIENT.NS", "sector": "Industrials"},
+    "Adani Ports": {"ticker": "ADANIPORTS.NS", "sector": "Industrials"},
+    "Apollo Hospitals": {"ticker": "APOLLOHOSP.NS", "sector": "Healthcare"},
+    "Asian Paints": {"ticker": "ASIANPAINT.NS", "sector": "Consumer Discretionary"},
+    "Axis Bank": {"ticker": "AXISBANK.NS", "sector": "Financials"},
+    "Bajaj Auto": {"ticker": "BAJAJ-AUTO.NS", "sector": "Consumer Discretionary"},
+    "Bajaj Finserv": {"ticker": "BAJAJFINSV.NS", "sector": "Financials"},
+    "Bajaj Finance": {"ticker": "BAJFINANCE.NS", "sector": "Financials"},
+    "Bharti Airtel": {"ticker": "BHARTIARTL.NS", "sector": "Communication Services"},
+    "BPCL": {"ticker": "BPCL.NS", "sector": "Energy"},
+    "Britannia": {"ticker": "BRITANNIA.NS", "sector": "Consumer Staples"},
+    "Cipla": {"ticker": "CIPLA.NS", "sector": "Healthcare"},
+    "Coal India": {"ticker": "COALINDIA.NS", "sector": "Energy"},
+    "Divis Labs": {"ticker": "DIVISLAB.NS", "sector": "Healthcare"},
+    "Dr. Reddy's Labs": {"ticker": "DRREDDY.NS", "sector": "Healthcare"},
+    "Eicher Motors": {"ticker": "EICHERMOT.NS", "sector": "Consumer Discretionary"},
+    "Grasim Industries": {"ticker": "GRASIM.NS", "sector": "Materials"},
+    "HCL Technologies": {"ticker": "HCLTECH.NS", "sector": "Information Technology"},
+    "HDFC Bank": {"ticker": "HDFCBANK.NS", "sector": "Financials"},
+    "HDFC Life": {"ticker": "HDFCLIFE.NS", "sector": "Financials"},
+    "Hero MotoCorp": {"ticker": "HEROMOTOCO.NS", "sector": "Consumer Discretionary"},
+    "Hindalco": {"ticker": "HINDALCO.NS", "sector": "Materials"},
+    "HUL": {"ticker": "HINDUNILVR.NS", "sector": "Consumer Staples"},
+    "ICICI Bank": {"ticker": "ICICIBANK.NS", "sector": "Financials"},
+    "IndusInd Bank": {"ticker": "INDUSINDBK.NS", "sector": "Financials"},
+    "Infosys": {"ticker": "INFY.NS", "sector": "Information Technology"},
+    "ITC": {"ticker": "ITC.NS", "sector": "Consumer Staples"},
+    "JSW Steel": {"ticker": "JSWSTEEL.NS", "sector": "Materials"},
+    "Kotak Mahindra Bank": {"ticker": "KOTAKBANK.NS", "sector": "Financials"},
+    "L&T": {"ticker": "LT.NS", "sector": "Industrials"},
+    "LTIMindtree": {"ticker": "LTIM.NS", "sector": "Information Technology"},
+    "M&M": {"ticker": "M&M.NS", "sector": "Consumer Discretionary"},
+    "Maruti Suzuki": {"ticker": "MARUTI.NS", "sector": "Consumer Discretionary"},
+    "Nestle India": {"ticker": "NESTLEIND.NS", "sector": "Consumer Staples"},
+    "NTPC": {"ticker": "NTPC.NS", "sector": "Utilities"},
+    "ONGC": {"ticker": "ONGC.NS", "sector": "Energy"},
+    "Power Grid": {"ticker": "POWERGRID.NS", "sector": "Utilities"},
+    "Reliance": {"ticker": "RELIANCE.NS", "sector": "Energy"},
+    "SBI Life": {"ticker": "SBILIFE.NS", "sector": "Financials"},
+    "SBI": {"ticker": "SBIN.NS", "sector": "Financials"},
+    "Shriram Finance": {"ticker": "SHRIRAMFIN.NS", "sector": "Financials"},
+    "Sun Pharma": {"ticker": "SUNPHARMA.NS", "sector": "Healthcare"},
+    "Tata Consumer Products": {"ticker": "TATACONSUM.NS", "sector": "Consumer Staples"},
+    "Tata Motors": {"ticker": "TATAMOTORS.NS", "sector": "Consumer Discretionary"},
+    "Tata Steel": {"ticker": "TATASTEEL.NS", "sector": "Materials"},
+    "TCS": {"ticker": "TCS.NS", "sector": "Information Technology"},
+    "Tech Mahindra": {"ticker": "TECHM.NS", "sector": "Information Technology"},
+    "Titan": {"ticker": "TITAN.NS", "sector": "Consumer Discretionary"},
+    "UltraTech Cement": {"ticker": "ULTRACEMCO.NS", "sector": "Materials"},
+    "Wipro": {"ticker": "WIPRO.NS", "sector": "Information Technology"},
+}
+# Streamlit app setup
+st.set_page_config(page_title="NIFTY 50 News Analysis", layout="wide")
+# Custom CSS
+st.markdown("""
+    <style>
+    .stApp {
+        background: linear-gradient(to bottom right, #f0f4f8, #e0e7ff);
+    }
+    .sidebar .sidebar-content {
+        background: linear-gradient(to bottom, #4b5e7e, #7e8aa2);
+        color: white;
+    }
+    .stTable { border: 1px solid #ddd; border-radius: 5px; background: #fff; }
+    .news-container { border: 1px solid #e0e7ff; border-radius: 5px; padding: 10px; margin-bottom: 10px; }
+    </style>
+""", unsafe_allow_html=True)
+# Sidebar controls
+with st.sidebar:
+    st.title("NIFTY 50 News Analysis")
+    st.info("Analyze news sentiment for companies by sector over different time frames.")
+    sectors = sorted(set(data['sector'] for data in nifty_50_data.values()))
+    selected_sector = st.selectbox("Select a Sector", sectors)
+    selected_period = st.selectbox("Select Time Frame", ["1D", "5D", "1M", "6M", "YTD", "1Y", "5Y"], index=2)
+    button = st.button("Analyze News")
+# News-related setup
+newsapi = NewsApiClient(api_key=os.getenv("NEWSAPI_KEY"))
+summarizer = pipeline("summarization", model="t5-small")
+classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
+# Keyword weights
+keyword_weights = {
+    "revenue": 3, "profit": 3, "loss": 3, "earnings": 3, "EBITDA": 3, "quarterly results": 3, "annual report": 3,
+    "share price": 3, "market cap": 3, "dividend": 3, "buyback": 3, "stock split": 3, "bonus issue": 3,
+    "downgrade": 3, "upgrade": 3, "bullish": 3, "bearish": 3, "rating change": 3,
+    "acquisition": 2, "merger": 2, "takeover": 2, "buyout": 2, "new plant": 2, "factory": 2, "expansion": 2,
+    "investment": 2, "launch": 2, "R&D": 2, "deal": 2, "agreement": 2, "MoU": 2, "partnership": 2, "collaboration": 2,
+    "SEBI": 1.5, "fine": 1.5, "violation": 1.5, "compliance": 1.5, "FIR": 1.5, "probe": 1.5, "subsidy": 1.5,
+    "tax": 1.5, "regulation": 1.5, "policy change": 1.5, "license": 1.5, "CEO": 1.5, "CFO": 1.5, "resigns": 1.5,
+    "appointed": 1.5, "stepping down": 1.5, "fraud": 1.5, "scandal": 1.5, "mismanagement": 1.5, "whistleblower": 1.5,
+    "inflation": 1, "GDP": 1, "interest rate": 1, "RBI policy": 1, "sanctions": 1, "trade war": 1, "conflict": 1,
+    "export/import": 1, "recall": 1, "defect": 1, "complaint": 1, "customer issue": 1, "hack": 1, "breach": 1,
+    "cyberattack": 1, "data leak": 1
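+}
+# Normalize keys to lowercase: titles and descriptions are lowercased before matching,
+# so mixed-case keys such as "EBITDA" or "SEBI" would otherwise never match.
+keyword_weights = {k.lower(): w for k, w in keyword_weights.items()}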
+# Function to calculate time range based on selected period
+def get_date_range(period):
+    end_date = datetime.now()
+    if period == "1D":
+        start_date = end_date - timedelta(days=1)
+    elif period == "5D":
+        start_date = end_date - timedelta(days=5)
+    elif period == "1M":
+        start_date = end_date - timedelta(days=30)
+    elif period == "6M":
+        start_date = end_date - timedelta(days=180)
+    elif period == "YTD":
+        start_date = datetime(end_date.year, 1, 1)
+    elif period == "1Y":
+        start_date = end_date - timedelta(days=365)
+    else:  # 5Y
+        start_date = end_date - timedelta(days=365 * 5)
+    return start_date.strftime('%Y-%m-%d'), end_date.strftime('%Y-%m-%d')
+@st.cache_data
+def fetch_news(company_name, from_date, to_date, page_size=10):
+    try:
+        articles = newsapi.get_everything(
+            q=company_name,
+            from_param=from_date,
+            to=to_date,
+            language="en",
+            sort_by="publishedAt",
+            page_size=page_size
+        )["articles"]
+        relevant_articles = []
+        for article in articles:
+            title = (article.get("title", "") or "").lower()
+            desc = (article.get("description", "") or "").lower()
+            if any(keyword in title or keyword in desc for keyword in keyword_weights.keys()):
+                article["relevance_weight"] = sum(keyword_weights.get(keyword, 0) for keyword in keyword_weights if keyword in title or keyword in desc)
+                relevant_articles.append(article)
+        return sorted(relevant_articles, key=lambda x: x["relevance_weight"], reverse=True)[:10]
+    except Exception as e:
+        st.warning(f"Error fetching news for {company_name}: {str(e)}")
+        return []
+@st.cache_data
+def summarize_and_classify(news_articles):
+    sentiment_counts = {"Positive": 0, "Negative": 0, "Neutral": 0}
+    summaries = []
+    key_themes = {}
+    for article in news_articles:
+        content = article.get("content", "") or article.get("description", "") or article.get("title", "")
+        if not content:
+            continue
+        summary = summarizer(content[:2048], max_length=100, min_length=30, do_sample=False)[0]["summary_text"] if len(content) > 100 else content
+        sentences = summary.split(". ")
+        key_insight = max(sentences, key=lambda s: sum(keyword_weights.get(k, 0) for k in keyword_weights if k in s.lower()), default=summary)
+        sentiment_result = classifier(summary)[0]
+        sentiment_label = sentiment_result["label"]
+        sentiment_score = sentiment_result["score"]
+        if sentiment_label == "POSITIVE" and sentiment_score > 0.7:
+            sentiment_counts["Positive"] += 1
+            sentiment_display = "Positive"
+        elif sentiment_label == "NEGATIVE" and sentiment_score > 0.7:
+            sentiment_counts["Negative"] += 1
+            sentiment_display = "Negative"
+        else:
+            sentiment_counts["Neutral"] += 1
+            sentiment_display = "Neutral"
+        title = (article.get("title", "") or "").lower()
+        desc = (article.get("description", "") or "").lower()
+        for keyword in keyword_weights:
+            if keyword in title or keyword in desc:
+                key_themes[keyword] = key_themes.get(keyword, 0) + 1
+        summaries.append({
+            "title": article.get("title", "No title"),
+            "summary": summary,
+            "key_insight": key_insight,
+            "sentiment": sentiment_display,
+            "confidence": sentiment_score,
+            "url": article.get("url", "#"),
+            "published_at": article.get("publishedAt", "No date")
+        })
+    top_themes = sorted(key_themes.items(), key=lambda x: x[1], reverse=True)[:3]
+    return sorted(summaries, key=lambda x: x["confidence"], reverse=True)[:5], sentiment_counts, top_themes
+def display_news_articles(news_articles, company_name, selected_period):
+    colored_header(f"Summarized News for {company_name} ({selected_period})",
+                   description="Key Updates from the Selected Period",
+                   color_name="green-70")
+    for news in news_articles:
+        with st.container():
+            st.markdown('<div class="news-container">', unsafe_allow_html=True)
+            col1, col2 = st.columns([3, 1])
+            with col1:
+                st.subheader(news['title'])
+                st.write(f"**Summary**: {news['summary']}")
+                st.write(f"**Key Insight**: {news['key_insight']}")
+                st.markdown(f"[Read More]({news['url']})")
+            with col2:
+                if news['sentiment'] == "Positive":
+                    st.markdown(f"<span style='color: green'>🟢 Positive ({news['confidence']*100:.1f}%)</span>", unsafe_allow_html=True)
+                elif news['sentiment'] == "Negative":
+                    st.markdown(f"<span style='color: red'>🔴 Negative ({news['confidence']*100:.1f}%)</span>", unsafe_allow_html=True)
+                else:
+                    st.markdown(f"<span style='color: gray'>⚪ Neutral ({news['confidence']*100:.1f}%)</span>", unsafe_allow_html=True)
+                st.write(f"**Published**: {news['published_at']}")
+            st.markdown('</div>', unsafe_allow_html=True)
+# Main app layout
+st.title("📰 NIFTY 50 Sector News Analysis")
+st.markdown("Analyze news sentiment for companies in a selected sector to guide investment decisions.", unsafe_allow_html=True)
+if button:
+    with st.spinner("Fetching and analyzing news..."):
+        # Get date range
+        from_date, to_date = get_date_range(selected_period)
+        # Filter companies by selected sector
+        companies_in_sector = {name: data for name, data in nifty_50_data.items()
+                             if data['sector'] == selected_sector}
+        # Fetch and analyze news
+        sentiment_data = []
+        all_news = {}
+        sector_sentiment_counts = {"Positive": 0, "Negative": 0, "Neutral": 0}
+        max_articles = 0
+        for company_name in companies_in_sector.keys():
+            news_articles = fetch_news(company_name, from_date, to_date)
+            if news_articles:
+                summarized_news, sentiment_counts, top_themes = summarize_and_classify(news_articles)
+                total_articles = sum(sentiment_counts.values())
+                max_articles = max(max_articles, total_articles)
+                sentiment_score = (sentiment_counts["Positive"] - sentiment_counts["Negative"]) / total_articles if total_articles > 0 else 0
+                dominant_sentiment = max(sentiment_counts, key=sentiment_counts.get)
+                sentiment_data.append({
+                    "Company": company_name,
+                    "Positive": sentiment_counts["Positive"],
+                    "Negative": sentiment_counts["Negative"],
+                    "Neutral": sentiment_counts["Neutral"],
+                    "Total": total_articles,
+                    "Sentiment Score": sentiment_score,
+                    "Dominant Sentiment": dominant_sentiment,
+                    "Top Themes": [theme[0] for theme in top_themes]
+                })
+                all_news[company_name] = summarized_news
+                for sentiment, count in sentiment_counts.items():
+                    sector_sentiment_counts[sentiment] += count
+        # Display results
+        if sentiment_data:
+            colored_header(f"Sentiment Analysis for {selected_sector} Sector ({selected_period})",
+                         description=f"News from {from_date} to {to_date}",
+                         color_name="blue-70")
+            # Single sentiment table
+            sentiment_df = pd.DataFrame(sentiment_data)[["Company", "Positive", "Negative", "Neutral", "Total", "Sentiment Score"]]
+            sentiment_df = sentiment_df.sort_values("Sentiment Score", ascending=False)
+            st.subheader("Company Sentiment Overview")
+            st.table(sentiment_df)
+            # Decision Guidance
+            colored_header("Decision Guidance", description="Investment Insights Based on News Sentiment", color_name="violet-70")
+            st.markdown("**Note**: These insights are based on news sentiment analysis and are not financial advice. Consult a financial advisor.")
+            sector_total = sum(sector_sentiment_counts.values())
+            sector_positive_pct = (sector_sentiment_counts["Positive"] / sector_total * 100) if sector_total > 0 else 0
+            sector_negative_pct = (sector_sentiment_counts["Negative"] / sector_total * 100) if sector_total > 0 else 0
+            sector_neutral_pct = (sector_sentiment_counts["Neutral"] / sector_total * 100) if sector_total > 0 else 0
+            sector_sentiment = "Positive" if sector_positive_pct > 50 else "Negative" if sector_negative_pct > 50 else "Neutral"
+            st.markdown(f"**Sector Sentiment**: {sector_sentiment} ({sector_positive_pct:.1f}% Positive, {sector_negative_pct:.1f}% Negative, {sector_neutral_pct:.1f}% Neutral)")
+            st.markdown(f"- **Investment Outlook**: {'Favorable' if sector_positive_pct > 50 else 'Cautious' if sector_negative_pct > 50 else 'Neutral'} for selective investments in the {selected_sector} sector.")
+            st.markdown("**Company Insights**:")
+            for company in sentiment_data:
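+                # Guard against division by zero when no analyzed article had usable text
+                coverage = company["Total"] / max_articles if max_articles > 0 else 0
+                confidence = "High" if coverage > 0.7 else "Medium" if coverage > 0.3 else "Low"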
+                recommendation = "Consider buying" if company["Sentiment Score"] > 0.3 else "Avoid" if company["Sentiment Score"] < -0.3 else "Monitor"
+                themes_str = ", ".join(company["Top Themes"]) if company["Top Themes"] else "none"
+                st.markdown(f"- **{company['Company']}**: Score: {company['Sentiment Score']:.2f} ({company['Dominant Sentiment']}, driven by {themes_str}), {company['Total']} articles (Confidence: {confidence}). {recommendation}.")
+            # Detailed news for each company
+            for company_name in sentiment_df["Company"]:
+                if company_name in all_news and all_news[company_name]:
+                    display_news_articles(all_news[company_name], company_name, selected_period)
+        else:
+            st.warning(f"No relevant news found for {selected_sector} sector in the selected period.")
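
Review note: the weighted keyword filter in `fetch_news` can be exercised on its own, without NewsAPI or Streamlit. The sketch below assumes case-insensitive substring matching and uses a small, hypothetical subset of the weights. The names `KEYWORD_WEIGHTS`, `relevance_weight`, and `rank_articles` are illustrative and do not appear in the commit, and the sample articles are invented.

```python
from typing import Dict, List, Optional

# Hypothetical subset of the keyword weights, for illustration only.
KEYWORD_WEIGHTS: Dict[str, float] = {
    "profit": 3,
    "EBITDA": 3,
    "merger": 2,
    "SEBI": 1.5,
    "recall": 1,
}


def relevance_weight(
    title: Optional[str],
    description: Optional[str],
    weights: Dict[str, float] = KEYWORD_WEIGHTS,
) -> float:
    """Sum the weights of all keywords found in the title or description."""
    title = (title or "").lower()
    description = (description or "").lower()
    return sum(
        weight
        for keyword, weight in weights.items()
        # Lowercase the keyword too, so "EBITDA" matches "ebitda".
        if keyword.lower() in title or keyword.lower() in description
    )


def rank_articles(
    articles: List[dict],
    weights: Dict[str, float] = KEYWORD_WEIGHTS,
    limit: int = 10,
) -> List[dict]:
    """Keep articles that match at least one keyword, highest weight first."""
    scored = []
    for article in articles:
        weight = relevance_weight(article.get("title"), article.get("description"), weights)
        if weight > 0:
            scored.append({**article, "relevance_weight": weight})
    return sorted(scored, key=lambda a: a["relevance_weight"], reverse=True)[:limit]


# Invented sample articles shaped like NewsAPI results (title/description keys).
articles = [
    {"title": "SEBI opens inquiry", "description": ""},
    {"title": "Weather today", "description": None},
    {"title": "Acme posts record profit", "description": "EBITDA up sharply"},
]
ranked = rank_articles(articles)
for article in ranked:
    print(article["relevance_weight"], article["title"])
```

Plain substring matching is simple but permissive: a short keyword such as "fine" also matches "refined". Word-boundary matching would be stricter, at the cost of a regular expression per keyword.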

nifty-news-analysis DELETED

	@@ -1 +0,0 @@
-Subproject commit 0bad46e406c22516da78dc994b76c60a0693d25c

requirements.txt ADDED

	@@ -0,0 +1,6 @@

+streamlit==1.38.0
+newsapi-python==0.2.7
+transformers==4.44.2
+streamlit-extras==0.4.7
+pandas==2.2.3
+torch