# Sara Duterte's impeachment moves to trial - Rappler

The impeachment trial of Vice President Sara Duterte is one of the most politically charged events in the Philippines this year. News outlets like Rappler, Inquirer.net, and Philstar.com have been racing to provide minute-by-minute updates as the Senate prepares for a trial set for July 6. But beyond the political drama, this event offers a fascinating lens into how modern technology (web scraping, natural language processing, and real-time data pipelines) powers the journalism we consume.

For developers and data journalists, the Sara Duterte impeachment trial is a live case study in how software engineering and AI can make sense of complex political processes. From aggregating RSS feeds to building machine-learning models that detect bias, the tech stack behind coverage like Rappler's "Sara Duterte's impeachment moves to trial" is as intriguing as the trial itself.

In this article, we'll look at the engineering decisions, open-source tools, and data challenges that shape how this story is told. Whether you're a software engineer curious about political tech or a journalist looking to automate your workflow, you'll find practical insights grounded in a real-world event.

---

1. The Tech Behind Real-Time News Aggregation for the Impeachment Trial

When the House of Representatives impeached Sara Duterte on charges of betrayal of public trust, the news broke simultaneously across multiple outlets. Within minutes, Google News aggregated stories from Rappler, Inquirer.net, BusinessMirror, and Philstar.com. The underlying technology that makes this aggregation possible is a combination of RSS feed parsers, web scrapers, and ranking algorithms.

For developers, building a custom aggregator for the impeachment coverage can be done with Python's feedparser library and a lightweight cache like Redis. The challenge is normalizing article bodies from different sources: each news site uses different HTML structures, ad placements, and paywalls. Tools like Newspaper3k help extract clean text, authors, and publish dates.
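To make the normalization step concrete, here is a minimal sketch that parses an RSS 2.0 feed into a common record shape and keeps only items matching a keyword. It uses the standard library's `xml.etree.ElementTree` as a stand-in for feedparser so that it runs without extra dependencies. The sample feed, the `Article` record, and the `parse_feed` helper are all illustrative assumptions, not any outlet's real data.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import List, Optional

# Hypothetical sample feed; real feeds carry more fields and vary by outlet.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Outlet</title>
    <item>
      <title>Senate sets impeachment trial date</title>
      <link>https://example.com/news/trial-date</link>
      <pubDate>Mon, 02 Jun 2025 08:30:00 +0800</pubDate>
    </item>
    <item>
      <title>Budget hearing resumes</title>
      <link>https://example.com/news/budget</link>
      <pubDate>Mon, 02 Jun 2025 09:00:00 +0800</pubDate>
    </item>
  </channel>
</rss>"""


@dataclass
class Article:
    """Common record shape shared by every source (an assumed schema)."""
    source: str
    title: str
    url: str
    published: Optional[datetime]


def parse_feed(xml_text: str, keyword: str) -> List[Article]:
    """Return feed items whose title contains `keyword` (case-insensitive)."""
    channel = ET.fromstring(xml_text).find("channel")
    if channel is None:
        return []
    source = channel.findtext("title", default="unknown").strip()
    articles = []
    for item in channel.iter("item"):
        title = item.findtext("title", default="").strip()
        if keyword.lower() not in title.lower():
            continue
        raw_date = item.findtext("pubDate")
        articles.append(
            Article(
                source=source,
                title=title,
                url=item.findtext("link", default="").strip(),
                # RSS 2.0 dates use the RFC 822 format, which email.utils parses.
                published=parsedate_to_datetime(raw_date) if raw_date else None,
            )
        )
    return articles


matches = parse_feed(SAMPLE_RSS, "impeachment")
```

In a real aggregator, the parsed records would be written to the cache keyed by URL, so repeated polls of the same feed do not reprocess items already seen.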

In production, we found that deduplication becomes critical. The same wire service article may appear on multiple outlets with slight rewrites. A simple SimHash or MinHash algorithm can flag near-duplicate content. This prevents readers from seeing three identical paragraphs about the pre-trial completion.
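As a rough illustration of the SimHash idea, the sketch below fingerprints each article from its three-word shingles and compares fingerprints by Hamming distance. The shingle size, the 64-bit width, the use of MD5 as a stable (non-security) hash, and the distance threshold are all assumptions chosen for readability, not tuned values.

```python
import hashlib
import re


def shingles(text, k=3):
    """Split text into overlapping k-word shingles, ignoring case and punctuation."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        return [" ".join(words)] if words else []
    return [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]


def simhash(text, bits=64):
    """Compute a SimHash fingerprint: each shingle votes +1/-1 on every bit."""
    weights = [0] * bits
    for shingle in shingles(text):
        digest = hashlib.md5(shingle.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")  # first 64 bits of the digest
        for i in range(bits):
            weights[i] += 1 if (value >> i) & 1 else -1
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(text_a, text_b, max_distance=6):
    """Flag two texts as near-duplicates; max_distance is an assumed threshold."""
    return hamming(simhash(text_a), simhash(text_b)) <= max_distance
```

A small distance means most shingles are shared, which is what a lightly rewritten wire story looks like. The threshold should be calibrated on real article pairs before it is trusted.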

---

2. How APIs and Web Scraping Keep Us Updated on Sara Duterte's Trial

Most major Philippine news publishers still lack public, developer-friendly APIs. That leaves engineers with two options: reverse-engineer internal APIs (which change frequently) or fall back to HTTP scraping. For a project tracking coverage of the impeachment trial, a hybrid approach works best.

Rappler, for instance, serves its articles via a headless CMS with REST endpoints that expose JSON-LD metadata. By inspecting network requests, we can programmatically fetch the latest articles on the impeachment tag. Scrapy, the popular Python framework, can handle this with custom middlewares for rate limiting and user-agent rotation.

However, scraping political news brings ethical and legal considerations. The Robots Exclusion Protocol (robots.txt) must be respected. For example, Philstar.com's robots.txt disallows /cmlink/, which might include internal paths. Ignoring such rules could lead to IP bans. Best practice is to set a crawl delay of at least 5 seconds and cache aggressively.
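A scraper can check these rules before every request with the standard library's `urllib.robotparser`. The sketch below parses a robots.txt body from a string so it runs offline. The file content is a hypothetical example that mirrors the `/cmlink/` rule described above, and the bot name and URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body; a real crawler would download the live file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cmlink/
Crawl-delay: 5
"""

USER_AGENT = "impeachment-tracker-bot"  # placeholder bot name


def build_parser(robots_text):
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser, url, default_delay=5):
    """Return (may_fetch, seconds_to_wait) for a URL under the parsed rules."""
    delay = parser.crawl_delay(USER_AGENT)
    # Fall back to the conservative default when the site declares no delay.
    return parser.can_fetch(USER_AGENT, url), delay if delay is not None else default_delay


rules = build_parser(ROBOTS_TXT)
blocked_check = allowed(rules, "https://example.com/cmlink/feed")
open_check = allowed(rules, "https://example.com/headlines/trial")
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` fetches the real file instead of the inline string.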

---

Beyond headlines, the actual impeachment complaint, a 30-page PDF, contains dense legal language. Natural Language Processing (NLP) can help journalists and citizens quickly grasp the key accusations. Using spaCy or Stanford CoreNLP, we can perform named entity recognition to extract names, dates, and legal statutes mentioned in the document.

A more advanced application is sentiment analysis on the defense team's statements. For example, Inquirer.net reported that "House prosecutors thank Senate after VP Duterte pretrial completion." A trained classifier could categorize such statements as cooperative or adversarial. Pre-trained models like cardiffnlp/twitter-roberta-base-sentiment can be fine-tuned on a small dataset of Philippine political texts to achieve higher accuracy.

However, local context is crucial. The word "thank" might carry sarcasm in a political feud. We learned that simple bag-of-words models performed poorly, while contextual embeddings (BERT) gave much better results, capturing nuances like "thank you for the delay" vs. "thank you for the fairness."

---

4. Data Integrity and Source Credibility in Political News

In an era of fake news, verifying the source of each claim is paramount. For developers building a dashboard around the impeachment coverage, implementing provenance tracking is essential. Each article should be linked to its original URL, author, and publication timestamp. Tools like Mozilla's private relay aren't directly applicable, but cryptographic hashing of raw content can make tampering detectable.

We designed a system where each scraped article's body is stored with its SHA-256 hash. If a news outlet later edits the piece without a correction notice, the hash mismatch alerts users. This is similar to the NewsGuard methodology, but automated.
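The hash check itself is only a few lines with `hashlib`. This is a minimal sketch, assuming the body is whitespace-normalized before hashing so that trivial reflows do not raise false alarms. That normalization step and the function names are assumptions added here, not a description of the exact system above.

```python
import hashlib


def content_hash(body: str) -> str:
    """SHA-256 hex digest of the article body after collapsing whitespace."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(stored_hash: str, current_body: str) -> bool:
    """True when a re-scraped body no longer matches the stored fingerprint."""
    return content_hash(current_body) != stored_hash


# Placeholder text standing in for a scraped article body.
original = "The Senate has set the trial for July 6."
stored = content_hash(original)
```

Storing the digest next to the URL and scrape timestamp gives a simple audit trail: any later mismatch points to a specific article and a time window in which it was edited.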

Another integrity measure is cross-referencing quotes across outlets. When Philstar.com writes "Senate races to keep VP Sara impeachment trial on track for July 6," and BusinessMirror gives the same date, we can correlate them. Mismatches, like one outlet reporting "July 6" and another "July 8," trigger a manual review flag.
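One simple way to approximate this check is to pull "Month day" mentions out of each outlet's text with a regular expression and flag the story when the sets disagree. The sketch below assumes English month names and ignores years and relative dates ("next Monday"). Apart from the Philstar.com headline quoted above, the sample snippets are placeholders.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
DATE_RE = re.compile(rf"\b({MONTHS})\s+(\d{{1,2}})\b")


def extract_dates(text):
    """Return the set of 'Month day' mentions found in a text."""
    return {f"{month} {int(day)}" for month, day in DATE_RE.findall(text)}


def needs_review(snippets):
    """Flag a story when outlets that mention dates do not mention the same ones.

    `snippets` maps an outlet name to the text being compared.
    """
    date_sets = [extract_dates(text) for text in snippets.values()]
    non_empty = [dates for dates in date_sets if dates]
    return any(dates != non_empty[0] for dates in non_empty[1:])


agreeing = {
    "philstar": "Senate races to keep VP Sara impeachment trial on track for July 6",
    "outlet_b": "Placeholder report: the trial opens July 6, senators say.",
}
conflicting = dict(agreeing, outlet_c="Placeholder report: the trial opens July 8.")
```

A flag here only means "a human should look." Two outlets can legitimately mention different dates when they cover different steps of the process.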

---

5. Building a Dashboard to Track Impeachment Developments

A single-page dashboard that aggregates, filters, and visualizes the trial's progress is a valuable tool for journalists and engaged citizens. Using React (or Vue) for the frontend and Node.js or Python for the backend, we can create endpoints that serve structured data from the scraped articles.

The key features of such a dashboard include:

  • Timeline view: Events sorted by date, with tags for "pretrial," "hearing," and "ruling."
  • Sentiment graph: A line chart showing positive/negative sentiment across time, colored by source.
  • Entity cloud: A word cloud of the most frequently mentioned people, places, and legal terms.
  • Source comparison: Side-by-side display of how different outlets cover the same event.
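The timeline view, for instance, reduces to sorting tagged events by date and optionally filtering by tag. The sketch below shows the kind of data a backend endpoint might return. The `Event` fields and the sample entries are hypothetical placeholders, not real trial events.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    """One dashboard timeline entry (an assumed schema)."""
    day: date
    tag: str        # e.g. "pretrial", "hearing", "ruling"
    headline: str
    source: str


def timeline(events: List[Event], tag: Optional[str] = None) -> List[Event]:
    """Return events in chronological order, optionally restricted to one tag."""
    selected = [e for e in events if tag is None or e.tag == tag]
    return sorted(selected, key=lambda e: e.day)


# Placeholder events for illustration only.
SAMPLE_EVENTS = [
    Event(date(2025, 3, 10), "hearing", "Placeholder headline B", "outlet_b"),
    Event(date(2025, 1, 5), "pretrial", "Placeholder headline A", "outlet_a"),
    Event(date(2025, 2, 1), "pretrial", "Placeholder headline C", "outlet_c"),
]

pretrial_only = timeline(SAMPLE_EVENTS, tag="pretrial")
```

The frontend can render this list directly, and the same records feed the source-comparison view when grouped by `source`.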

We built a prototype using Next.js and deployed it on Vercel. The scraping backend ran on AWS Lambda with S3 for storage. The biggest challenge was handling rate limits: free tiers of news sites often block after 100 requests per minute. We introduced exponential backoff and proxy rotation (via ScrapingBee) to stay under the radar.
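Exponential backoff can be sketched independently of any HTTP client by passing the fetch function in as a parameter. In the sketch below, `RateLimited`, the retry count, and the delay values are assumptions. A real scraper would raise `RateLimited` when the site answers with HTTP 429 or a block page.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller-supplied fetch function when the site throttles us."""


def fetch_with_backoff(fetch, url, max_retries=5, base_delay=1.0,
                       max_delay=60.0, sleep=time.sleep):
    """Call fetch(url), doubling the wait after each rate-limit error.

    A small random jitter (up to 10% of the delay) keeps parallel workers
    from retrying in lockstep. `sleep` is injectable so tests run instantly.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except RateLimited:
            if attempt == max_retries:
                raise  # give up after the final attempt
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Capping the delay with `max_delay` matters for long outages: without it, the fifth or sixth retry would wait far longer than a news cycle.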

---

6. The Role of Open-Source Tools in Monitoring Philippine Politics

The tech community in the Philippines has a tradition of building open-source tools for civic engagement. For the Duterte impeachment, several GitHub repositories appeared within days, offering scripts to scrape Senate announcements and RSS feeds. One popular repo used GitHub Actions to automatically update a JSON file every hour with the latest articles mentioning "Sara Duterte impeachment."

These tools rely on libraries like requests, beautifulsoup4, and pandas. For NLP, transformers from Hugging Face is the go-to. Developers who want to contribute can fork these repos and add features like language detection (Tagalog vs. English) or geolocation of events.

However, maintaining open-source projects for news monitoring is time-consuming. The DOM structures of news sites change, breaking scrapers. One lesson from our experience is to use CSS selectors that target stable attributes like data-testid (if present) instead of fragile class names. Also, writing integration tests that run nightly can catch breakage early.
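The preference for stable attributes can be expressed as an ordered list of selectors: try `data-testid` first, fall back to a class name, and log when a selector stops matching. The sketch below uses the standard library's `html.parser` rather than beautifulsoup4 so it has no dependencies. It matches attribute values exactly, which is simpler than real CSS selectors, and the attribute values and sample markup are hypothetical.

```python
import logging
from html.parser import HTMLParser

log = logging.getLogger("scraper")

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class AttrTextExtractor(HTMLParser):
    """Collect the text inside the first element whose attribute equals a value."""

    def __init__(self, name, value):
        super().__init__()
        self.name, self.value = name, value
        self.depth = 0      # > 0 while inside the matched element
        self.done = False   # set once the matched element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get(self.name) == self.value:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and not self.done:
            self.chunks.append(data)


def extract_first(html, selectors):
    """Try (attribute, value) selectors in order; return the first non-empty text."""
    for name, value in selectors:
        parser = AttrTextExtractor(name, value)
        parser.feed(html)
        text = " ".join("".join(parser.chunks).split())
        if text:
            return text
        log.warning("selector %s=%r matched nothing", name, value)
    return None


HEADLINE_SELECTORS = [("data-testid", "headline"), ("class", "headline")]
```

The warning log is the useful part in practice: when the first selector starts failing on every page, that is an early sign the site's markup changed, even though the fallback still returns a headline.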

---

7. Challenges: Handling Misinformation and Breaking News Feeds

During high-stakes political events, misinformation spreads fast. Automated systems that aggregate impeachment coverage may accidentally ingest fake news from lesser-known sites. To mitigate this, our pipeline included a whitelist of trusted domains (Rappler, Inquirer, Philstar, BusinessMirror, ABS-CBN News) and a blacklist of known disinformation sources.
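A domain filter like this is easy to get subtly wrong: a plain substring check would accept a look-alike host such as `rappler.com.evil.example`. The sketch below matches on the parsed hostname instead. The domain strings for the trusted outlets are assumptions about how those outlets would be listed, and the blocked domain is a made-up placeholder.

```python
from urllib.parse import urlparse

# Assumed domain entries for the trusted outlets named above.
TRUSTED_DOMAINS = {
    "rappler.com",
    "inquirer.net",
    "philstar.com",
    "businessmirror.com.ph",
    "abs-cbn.com",
}
# Placeholder entry; a real list would come from a maintained source.
BLOCKED_DOMAINS = {"fake-news.example"}


def _matches(host, domains):
    """True when host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify(url):
    """Return 'blocked', 'trusted', or 'unknown' for an article URL."""
    host = (urlparse(url).hostname or "").lower()
    if _matches(host, BLOCKED_DOMAINS):
        return "blocked"
    if _matches(host, TRUSTED_DOMAINS):
        return "trusted"
    return "unknown"
```

Routing "unknown" sources to a review queue, rather than dropping them, keeps the list from silently excluding legitimate regional outlets.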

Another challenge is the speed of breaking news. When the Senate announced the trial date, all major outlets published within minutes. Our scraper, polling every 10 minutes, could lag behind the first reports by up to 10 minutes. Switching to webhook-based updates (if available) or using a service like Superfeedr can reduce latency.

We also encountered the "hammer effect": multiple scrapers hitting the same server simultaneously during peak news moments. A staggered schedule and caching layer (Redis with 5-minute TTL) significantly reduced load. For the end user, stale data is acceptable if it means not getting blocked.
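The caching half of that fix can be illustrated with a small in-memory stand-in for Redis. The sketch below serves a stored response until it is five minutes old and only then calls the fetch function again. The `TTLCache` class and its injectable clock are assumptions made for testability. A production setup would more likely rely on Redis key expiry.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so tests can control time
        self.store = {}         # key -> (stored_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) only when stale."""
        now = self.clock()
        entry = self.store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch(key)
        self.store[key] = (now, value)
        return value
```

With several workers sharing one cache, only the first request per URL in each five-minute window reaches the news site, which is what takes the pressure off during peak moments.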

---

8. What Software Engineers Can Learn from Covering High-Stakes Politics

Building a system around the impeachment coverage teaches software engineers several transferable skills. First, designing for failure: news sites frequently change their HTML structure. A resilient scraper should have fallback selectors and logging to alert developers when extraction fails.

Second, handling unstructured data at scale. Legal PDFs, video transcripts from Senate hearings, and social media statements all require different parsing techniques. Using a microservice architecture where each data type has its own parser (PDF service, HTML service, Twitter API service) keeps the system maintainable.

Third, ethical scraping. Always respect robots.txt, add a user-agent that identifies your bot and provides contact info, and never republish verbatim copyrighted content. Instead, display summaries or links. This aligns with fair use provisions and keeps you out of legal trouble.

---

9. Future of AI-Assisted Journalism in the Philippines

The Duterte impeachment trial is a proving ground for AI-assisted journalism. We're already seeing tools that use GPT-4 to generate article summaries from multiple sources, allowing editors to quickly produce roundups. However, hallucination remains a problem: models sometimes invent quotes. For now, human oversight is mandatory.

Another emerging trend is AI-powered fact-checking. Organizations like VERA Files in the Philippines are experimenting with machine learning to flag disputed claims. For the impeachment, a model trained on Senate records could verify whether a quoted statement matches the official transcript.
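Before any trained model enters the picture, a useful baseline for quote verification is plain fuzzy string matching against transcript sentences. The sketch below uses `difflib.SequenceMatcher` from the standard library, which is a much simpler technique than the machine-learning approach described above. The 0.9 threshold and the transcript lines are placeholder assumptions, not real Senate records.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and strip punctuation so formatting differences do not matter."""
    return " ".join(re.findall(r"\w+", text.lower()))


def best_match(quote, transcript_sentences):
    """Return (similarity, sentence) for the closest transcript sentence."""
    target = normalize(quote)
    scored = [
        (SequenceMatcher(None, target, normalize(sentence)).ratio(), sentence)
        for sentence in transcript_sentences
    ]
    return max(scored, default=(0.0, None))


def is_supported(quote, transcript_sentences, threshold=0.9):
    """True when some transcript sentence is at least `threshold` similar."""
    score, _ = best_match(quote, transcript_sentences)
    return score >= threshold


# Placeholder transcript lines for illustration only.
TRANSCRIPT = [
    "We thank the Senate for its fairness.",
    "The trial will begin as scheduled.",
]
```

A baseline like this catches exact and lightly edited quotes. Paraphrases and translated remarks need something stronger, which is where a trained model would earn its keep.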

We also anticipate more personalized news feeds. Instead of a generic "impeachment" tag, readers could select topics like "defense arguments" or "prosecution tactics" and receive curated articles. This requires sophisticated topic modeling (LDA) and user profiling, a challenge that many Filipino startups are already tackling.

---

Frequently Asked Questions

  1. When does Sara Duterte's impeachment trial officially start?
    The Senate has set the trial for July 6, 2025, following the conclusion of pre-trial proceedings. All major outlets, including Rappler and Philstar.com, confirm this date.
  2. How can developers access news data about the trial programmatically?
    Most Philippine news sites don't offer official APIs. However, you can use RSS feeds (e.g., Rappler's RSS at https://www.rappler.com/feed/) and web scraping with Python libraries like Scrapy, ensuring compliance with robots.txt.
  3. What NLP tools are best for analyzing legal documents from the trial?
    For PDF extraction, use pdfplumber or camelot. For entity recognition, spaCy's en_core_web_lg works well. For sentiment analysis tailored to Philippine politics, fine-tune a BERT model on localized datasets.
  4. Is it legal to scrape news sites for this impeachment coverage?
    Scraping for personal, non-commercial use is generally tolerated but not guaranteed, so always review the site's terms of service. Some outlets, like BusinessMirror, have explicit restrictions. Use cached data and display only summaries.
  5. Can AI accurately summarize the impeachment hearings?
    Large language models like GPT-4 can produce coherent summaries, but they sometimes hallucinate. For the trial, combine AI summarization with human fact-checking to ensure accuracy.
---

Conclusion

The impeachment trial of Sara Duterte is more than a political event: it is a technical challenge for anyone building tools to monitor, analyze, and understand it. From real-time web scraping to NLP-powered sentiment analysis, the stack behind coverage like Rappler's "Sara Duterte's impeachment moves to trial" demonstrates how software engineering can democratize information.

Call to action: If you're a developer interested in civic tech, contribute to an open-source project that tracks this trial. Fork a scraper, improve an NLP model, or build a dashboard. The tools we build today will shape how the next generation engages with democracy.

---

What do you think?

Do you believe automated news aggregation should be required to display source credibility scores, or does that introduce bias?

Is it ethical to use AI-generated summaries of legal proceedings without explicit oversight from a licensed lawyer or journalist?

Should Philippine news outlets be forced to open public APIs for non-commercial developers, similar to the European Union's data access guidelines?
