From RSS to Telegram and X: how to automate a publication flow without losing editorial control
Tutorial to automate content publishing from RSS to Telegram and X, with human validation and real editorial control.

I have a Telegram channel and an X account where I publish curated technical content. For a while I did everything by hand: I read articles, selected the ones I found interesting, wrote a brief comment and published them. It worked, but it took 30 to 45 minutes every day. And the worst part was not the time: it was the inconsistency. On days when I did not feel like it, I did not publish. And a channel that publishes irregularly loses its audience fast.
So I built an automated system. Not to publish on its own, but to prepare everything so I only had to approve or discard. That difference is important: I am not looking for a bot that spams links. I am looking for an assistant that saves me the mechanical part and lets me focus on editorial judgment.
This article explains the full flow I use, based on what I built for Rolsfera. If you manage a technical content channel and want to automate without losing control, this will save you quite a few hours of trial and error.
The full flow
Before diving into each piece, this is the end-to-end flow:
    RSS feeds → Parsing → Filtering → Summary (AI) → Review queue → Human approval → Formatting → Publishing (Telegram / X)

It looks simple. And conceptually it is. The real complexity lies in the details of each step: how you filter, how you summarize, how you format for each platform, how you handle errors and duplicates.
Let us go piece by piece.
Step 1: Collecting content from RSS
The entry point is RSS feeds. I have a list of about 40 sources I follow: technical blogs, specialized media, newsletters that publish via RSS, GitHub repos with release feeds and a few subreddits via RSS.
    # feeds.py - List of sources with metadata
    FEEDS = [
        {
            "url": "https://blog.pragmaticengineer.com/rss/",
            "name": "Pragmatic Engineer",
            "category": "engineering",
            "priority": "high",
        },
        {
            "url": "https://martinfowler.com/feed.atom",
            "name": "Martin Fowler",
            "category": "architecture",
            "priority": "high",
        },
        {
            "url": "https://news.ycombinator.com/rss",
            "name": "Hacker News",
            "category": "general",
            "priority": "medium",
        },
        # ... 37 more sources
    ]

Each source has a category and a priority. The priority is not arbitrary: I adjust it based on the history of articles I end up approving from each source. If I end up approving 80% of what Pragmatic Engineer publishes, it is high. If I only approve 10% of what comes from Hacker News, it is medium.
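The priority adjustment described above can be sketched as a small function over each source's approval history. This is only an illustration: the thresholds are assumptions chosen to agree with the two examples in the text (80% is high, 10% is medium), and the function name, the cut-off for `low` and the minimum sample size are all hypothetical.

```python
def priority_from_history(
    approved: int,
    reviewed: int,
    high_at: float = 0.5,     # assumed threshold, not the author's value
    medium_at: float = 0.05,  # assumed threshold, not the author's value
    min_reviewed: int = 20,   # assumed minimum history before trusting the rate
) -> str:
    """Map a source's approval rate to a priority label."""
    if reviewed < min_reviewed:
        # Not enough history yet: start in the middle
        return "medium"
    rate = approved / reviewed
    if rate >= high_at:
        return "high"
    if rate >= medium_at:
        return "medium"
    return "low"
```

Recomputing the label periodically from the review queue's history would keep the ordering in the queue aligned with what actually gets approved.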
Parsing is done with feedparser in Python. n8n triggers the process every 30 minutes through a cron trigger that calls an HTTP endpoint on my service.
    from datetime import datetime, timedelta, timezone

    import feedparser


    def fetch_new_articles(feed_url: str, since_hours: int = 2) -> list[dict]:
        """Return the entries published within the last `since_hours` hours."""
        feed = feedparser.parse(feed_url)
        cutoff = datetime.now(timezone.utc) - timedelta(hours=since_hours)
        articles = []
        for entry in feed.entries:
            # published_parsed is a UTC time.struct_time; it is missing when the feed has no date
            published = entry.get("published_parsed")
            if published:
                pub_date = datetime(*published[:6], tzinfo=timezone.utc)
                if pub_date < cutoff:
                    continue
            # Entries without a date are kept; deduplication handles the repeats
            articles.append({
                "title": entry.get("title", "").strip(),
                "url": entry.get("link", ""),
                "summary": entry.get("summary", ""),
                "published": entry.get("published", ""),
            })
        return articles

Step 2: Filtering and deduplication
Not everything that comes in is worth processing. Filtering has two levels:
Deduplication. If the same article is already in the database (by URL or content hash), it gets discarded. This is critical because many feeds share the same news.
Basic relevance filtering. Before spending AI tokens, I apply simple filters:
    # Keywords that indicate content relevant to my audience
    INCLUDE_KEYWORDS = [
        "python", "backend", "api", "architecture", "kubernetes",
        "database", "automation", "scraping", "llm", "self-hosted",
        "devops", "data engineering", "microservices",
    ]
    # Content I normally discard
    EXCLUDE_PATTERNS = [
        "sponsored", "advertisement", "podcast episode",
        "weekly roundup",  # too generic
    ]


    def passes_basic_filter(article: dict) -> bool:
        # Plain substring matching: "api" also matches "rapid"
        text = f"{article['title']} {article['summary']}".lower()
        for pattern in EXCLUDE_PATTERNS:
            if pattern in text:
                return False
        for keyword in INCLUDE_KEYWORDS:
            if keyword in text:
                return True
        return False  # if nothing relevant matches, it does not pass

This filtering is rough, I know. But it cuts the volume of articles reaching the AI step by 60-70%, which directly reduces cost and processing time.
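The deduplication level mentioned at the start of this step (by URL or content hash) could look roughly like the sketch below. It assumes in-memory sets standing in for the database lookup, and the normalization (lowercasing, collapsing whitespace) is my own choice. `content_fingerprint` and `is_duplicate` are hypothetical names.

```python
import hashlib
import re


def content_fingerprint(title: str, summary: str) -> str:
    """SHA-256 hex digest of the normalized title and summary."""
    text = f"{title} {summary}".lower()
    text = re.sub(r"\s+", " ", text).strip()
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_duplicate(article: dict, seen_urls: set, seen_hashes: set) -> bool:
    """Return True if the URL or the content hash was already seen.

    New articles are recorded in both sets as a side effect.
    """
    fingerprint = content_fingerprint(article["title"], article["summary"])
    if article["url"] in seen_urls or fingerprint in seen_hashes:
        return True
    seen_urls.add(article["url"])
    seen_hashes.add(fingerprint)
    return False
```

An exact hash only catches copies whose text is identical after normalization; syndicated articles with edited titles would still get through and end up in the review queue.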
Step 3: Summarization with AI
Articles that pass the filter are sent to an LLM to generate a short summary and a classification. The prompt is designed to get exactly what I need to decide whether to publish:
    import json


    def generate_summary(article: dict) -> dict:
        # Step 1 stores the feed text under "summary"; use the full content when it is available
        content = article.get("content") or article.get("summary", "")
        prompt = f"""You are a technical editor. Analyze this article and respond in JSON:
    Title: {article['title']}
    Content: {content[:2500]}
    Respond with:
    {{
    "summary": "2-3 sentence summary, direct, no filler",
    "topic": "Main topic (e.g. Python, DevOps, architecture)",
    "is_actionable": true/false, // does it offer something practical?
    "suggested_comment": "One-line sentence I would use when sharing it"
    }}
    DO NOT include generic phrases like 'This article explores...'
    Be direct and concrete."""
        # call_llm is my own helper (not shown here); it returns the raw model text
        response = call_llm(prompt, model="gpt-4o-mini")
        return json.loads(response)

I use gpt-4o-mini for this task because I do not need the most powerful model. A 2-3 sentence summary and a classification are something small models do well. I reserve larger models for when I need to generate original content or do more complex analysis.
The prompt matters more than it seems. Adding “DO NOT include generic phrases” was the difference between getting useful summaries and getting filler like “This interesting article examines…”.
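Calling `json.loads` directly on the model's reply fails when the model wraps the JSON in a Markdown code fence or leaves out a field, which small models can do. The sketch below shows one defensive way to parse the reply. The required keys come from the prompt in this step; the function name `parse_llm_json` and the fence-stripping approach are assumptions of mine.

```python
import json

FENCE = "`" * 3  # Markdown code fence marker
REQUIRED_KEYS = {"summary", "topic", "is_actionable", "suggested_comment"}


def parse_llm_json(raw: str) -> dict:
    """Parse a model reply into a dict, tolerating a surrounding code fence."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with or without a "json" tag) ...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ... and everything from the closing fence onwards
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

Since `json.JSONDecodeError` is a subclass of `ValueError`, a caller can catch `ValueError` alone and mark the article as failed instead of stopping the whole batch.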
Step 4: Review queue
Processed articles arrive in a queue where I review them. In practice it is a PostgreSQL table with a minimal web interface on top:
    SELECT
        a.title,
        a.url,
        a.ai_metadata->>'summary' AS summary,
        a.ai_metadata->>'suggested_comment' AS comment,
        a.ai_metadata->>'topic' AS topic,
        a.source_name,
        a.created_at
    FROM articles a
    WHERE a.status = 'pending'
    ORDER BY
        CASE
            WHEN a.source_priority = 'high' THEN 1
            WHEN a.source_priority = 'medium' THEN 2
            ELSE 3
        END,
        a.created_at DESC;

The review takes me between 5 and 10 minutes a day. Articles from high-priority sources appear first. For each one, I do one of three things: approve (sometimes editing the AI-suggested comment), discard or save for later.
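The three review actions can be modeled as status transitions, which keeps an article from being published twice or revived after being discarded. In this sketch, `pending`, `approved` and `published` are the statuses named in this article; `discarded`, `saved`, the `final_comment` field and the function name are assumptions for illustration.

```python
from typing import Optional

# Which statuses each status may move to (assumed lifecycle)
ALLOWED_TRANSITIONS = {
    "pending": {"approved", "discarded", "saved"},
    "saved": {"approved", "discarded"},
    "approved": {"published"},
    "discarded": set(),
    "published": set(),
}


def apply_review_action(
    article: dict, new_status: str, edited_comment: Optional[str] = None
) -> dict:
    """Return a copy of the article with the new status, or raise if not allowed."""
    current = article["status"]
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current!r} to {new_status!r}")
    updated = {**article, "status": new_status}
    if edited_comment is not None:
        # Store the editor's wording separately; the AI suggestion stays untouched
        updated["final_comment"] = edited_comment
    return updated
```

Keeping the edited comment in its own field also makes it possible to compare later how often the AI suggestion was good enough as written.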
Step 5: Platform-specific formatting
Each platform has its own rules. What works on Telegram does not work on X and vice versa.
Telegram allows long messages, Markdown, emojis and links with preview. My typical format:
    def format_for_telegram(article: dict) -> str:
        comment = article["ai_metadata"]["suggested_comment"]
        title = article["title"]
        url = article["url"]
        topic = article["ai_metadata"]["topic"]
        return f"""🔗 *{title}*
    {comment}
    📌 Topic: {topic}
    👉 [Read article]({url})"""

X (Twitter) has a 280-character limit. The format is more compact:
    def format_for_x(article: dict) -> str:
        comment = article["ai_metadata"]["suggested_comment"]
        url = article["url"]
        # Hashtags cannot contain spaces or punctuation: "Data engineering" -> "#Dataengineering"
        topic = article["ai_metadata"]["topic"]
        hashtag = "#" + "".join(ch for ch in topic if ch.isalnum())
        # X counts every URL as 23 characters, whatever its real length
        max_comment_len = 280 - 23 - len(f" {hashtag}") - 5
        if len(comment) > max_comment_len:
            comment = comment[:max_comment_len - 3] + "..."
        return f"{comment} {hashtag} {url}"

Step 6: Publishing
Publishing is managed by n8n. When I mark an article as approved, its status changes in the database. An n8n workflow running every 5 minutes detects approved articles and publishes them.
The flow in n8n is simple:
- Cron trigger every 5 minutes
- HTTP Request to the endpoint that returns approved articles
- Split In Batches to avoid publishing everything at once
- Telegram Node to send to the channel
- HTTP Request to the X API to post the tweet
- HTTP Request to update the status to published
The “not publishing everything at once” part is important. If I approve 8 articles in the morning and they all get published simultaneously, the experience for followers is bad. I use a time slot system: maximum 2 publications per hour, with a minimum of 20 minutes between them.
    # Simplified scheduling logic
    from datetime import datetime, timedelta


    def get_next_publish_slot(last_published_at: datetime) -> datetime:
        min_gap = timedelta(minutes=20)
        next_slot = last_published_at + min_gap
        # Do not publish between 23:00 and 08:00
        if next_slot.hour >= 23 or next_slot.hour < 8:
            next_slot = next_slot.replace(hour=8, minute=0, second=0, microsecond=0)
            # Late in the evening, 08:00 on the same date is already in the past
            if next_slot <= last_published_at:
                next_slot += timedelta(days=1)
        return next_slot

Where to use n8n and where to use Python
A question that came up early on was how far to take the logic in n8n and when to move it to Python. After several iterations, my rule is this:
| Task | Tool | Why |
|---|---|---|
| Orchestration (triggers, scheduling) | n8n | Visual interface, easy to modify |
| Simple external API calls | n8n | Native nodes for Telegram, HTTP |
| RSS parsing | Python | feedparser is more robust than n8n’s RSS node |
| Filtering and deduplication | Python | Complex logic with DB access |
| AI processing | Python | Fine control over prompts and responses |
| Content formatting | Python | Template and truncation logic |
| Monitoring and alerts | n8n | Visual error handling, easy to debug |
The general rule: n8n for orchestration, Python for processing. When you try to put complex processing logic into n8n nodes, you end up with an unreadable workflow. And when you try to replicate n8n’s visual orchestration in code, you end up reinventing a scheduler.
Common errors (and how I handle them)
Rate limits
Both the Telegram and X APIs have request limits. Telegram is quite permissive with bots, but X is strict. My solution: a queue with retries and exponential backoff.
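    import time


    # Minimal exception definitions so the snippet runs on its own
    class RateLimitError(Exception):
        """Raised by publish_fn when the API reports a rate limit."""


    class PublishError(Exception):
        """Raised when every attempt has failed."""


    def publish_with_retry(publish_fn, content, max_retries=3):
        for attempt in range(max_retries):
            try:
                return publish_fn(content)
            except RateLimitError:
                if attempt == max_retries - 1:
                    break  # no point in waiting after the last attempt
                wait_time = (2 ** attempt) * 30  # 30s, 60s, ...
                print(f"Rate limit. Retrying in {wait_time}s...")
                time.sleep(wait_time)
        raise PublishError(f"Failed after {max_retries} attempts")

Duplicates that slip through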
Sometimes the same article appears in multiple feeds with slightly different URLs (with UTM parameters, for example). The solution is to normalize URLs before comparing:
    from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


    def normalize_url(url: str) -> str:
        parsed = urlparse(url)
        # Remove tracking parameters
        params = parse_qs(parsed.query)
        clean_params = {
            k: v for k, v in params.items()
            if not k.startswith("utm_")
        }
        clean_query = urlencode(clean_params, doseq=True)
        # Drop the fragment too, so links to a section compare equal
        return urlunparse(parsed._replace(query=clean_query, fragment=""))

Broken formatting in posts
Markdown that renders poorly in Telegram, tweets that exceed the character limit, links that do not generate a preview. I discovered these over time. The solution was to add validations before publishing:
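    import re


    def validate_telegram_message(text: str) -> bool:
        # Telegram rejects messages longer than 4096 characters
        if len(text) > 4096:
            return False
        # Ignore link targets: an underscore inside a URL is not formatting
        visible = re.sub(r"\]\([^)]*\)", "]", text)
        # Check that the Markdown markers are balanced
        if visible.count("*") % 2 != 0:
            return False
        if visible.count("_") % 2 != 0:
            return False
        return True

Banned bots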
It happened to me once with X. I was publishing too fast and the account was temporarily suspended. Since then: maximum 10 publications per day, with hourly distribution, and I never publish at exactly regular intervals (I add a random jitter of 1-5 minutes to avoid looking like a bot).
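The limits mentioned in this article (at most 10 publications per day, at most 2 per hour, at least 20 minutes apart, plus 1-5 minutes of jitter) can be combined into a single check before each publication. This is a sketch under assumptions: "per day" and "per hour" are read here as rolling windows, and `can_publish_now` and `jittered_delay` are hypothetical names.

```python
import random
from datetime import datetime, timedelta
from typing import Optional

MAX_PER_DAY = 10
MAX_PER_HOUR = 2
MIN_GAP = timedelta(minutes=20)


def can_publish_now(history: list, now: datetime) -> bool:
    """Check the daily cap, the hourly cap and the minimum gap.

    `history` holds the datetimes of previous publications.
    """
    last_day = [t for t in history if now - t < timedelta(hours=24)]
    last_hour = [t for t in history if now - t < timedelta(hours=1)]
    if len(last_day) >= MAX_PER_DAY or len(last_hour) >= MAX_PER_HOUR:
        return False
    return not history or now - max(history) >= MIN_GAP


def jittered_delay(rng: Optional[random.Random] = None) -> timedelta:
    """Random extra delay of 1 to 5 minutes, so intervals are never exact."""
    rng = rng or random.Random()
    return timedelta(seconds=rng.uniform(60, 300))
```

The publishing job would call `can_publish_now` on each run and, when it returns True, wait for `jittered_delay()` before posting.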
Human review: why I do not make it fully automatic
I could remove the review step and let the system publish everything that passes the filters. Technically it is trivial. But I do not do it for three reasons:
AI makes mistakes. Not always, but enough that a channel without supervision ends up publishing irrelevant or misclassified content. A summary that looks correct might be taking the article’s conclusion out of context.
Editorial judgment is what differentiates the channel. Anyone can set up a bot that publishes everything from Hacker News. What makes a content channel valuable is that someone with judgment has decided it is worth sharing.
It keeps me connected to the content. If I automate everything, I lose touch with what is being published in my field. The daily 10-minute review is also my quick reading session.
Automating is not about eliminating the human. It is about freeing the human from mechanical tasks so they can focus on what adds value: judgment.
Real costs of the system
To keep things concrete:
| Item | Approx. monthly cost |
|---|---|
| VPS (n8n + services) | ~10 EUR |
| LLM API (summaries, classification) | ~15-25 EUR |
| X API (free tier) | 0 EUR |
| Telegram Bot API | 0 EUR |
| PostgreSQL (self-hosted on VPS) | 0 EUR (included in VPS) |
| Total | ~25-35 EUR/month |
It is not free, but it is less than what any SaaS content management tool subscription costs. And the system is mine: I control it and can modify it without depending on a company changing its prices or shutting down its API.
Conclusion
Building this flow took me about two weeks of intermittent work. The first version was much simpler: a Python script that read RSS, generated a summary and sent it to Telegram. Fully automatic, no review. The result was mediocre: irrelevant posts, duplicates and a generic tone that did not represent my judgment.
The current version is better because it accepts that total automation is not the goal. The goal is to reduce mechanical work to a minimum and let the editorial decision remain human.
If you are thinking about building something similar, my advice is to start with the simplest step (RSS to Telegram, no AI, no complex filtering) and add layers as you need them. The complexity of this system did not come from a brilliant initial design, but from solving real problems one week after another.
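The starting point suggested above (RSS to Telegram, no AI, no filtering) needs little more than one call to the Bot API's `sendMessage` method. Here is a minimal sketch using only the standard library. It assumes an article dict with `title` and `url` keys like the one from Step 1, and `build_telegram_request`, `send`, the token and the channel name are placeholders.

```python
import urllib.parse
import urllib.request


def build_telegram_request(token: str, chat_id: str, article: dict) -> urllib.request.Request:
    """Build a sendMessage request with plain text (no parse_mode, so no escaping issues)."""
    text = f"{article['title']}\n{article['url']}"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=data,
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Sending plain text first avoids the Markdown problems described earlier; formatting can be added once the basic flow is stable.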


