News Storage and Retrieval System Using the New York Times API

content0474 2025. 1. 10. 14:45

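The problem

I was implementing a feature that stores and retrieves news by category using the NYT (New York Times) API. Because the category data was inconsistent, however, the news could not be retrieved correctly.

Issues
Category mapping mismatch: the categories dictionary maps "entertainment" to "arts", but the database defines the category as "Art", so the two never match
Case differences between category names
A hardcoded dictionary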
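
The mismatch can be reproduced in isolation. The sketch below is only an illustration: `db_category_names` is a hypothetical stand-in for the names in the category table (only "Art" is confirmed by the description above), and `find_exact` is a made-up helper that models an exact, case-sensitive comparison.

```python
# Hypothetical stand-in for the category names stored in the database.
db_category_names = ["World", "Technology", "Art"]

# Excerpt of the hardcoded mapping used by the original task.
categories = {"entertainment": "arts", "world": "world"}


def find_exact(name, names):
    """Return the stored name that equals `name` exactly, or None."""
    return next((n for n in names if n == name), None)


section = categories["entertainment"]          # "arts"
print(find_exact(section, db_category_names))  # None: "arts" != "Art"
print(find_exact("world", db_category_names))  # None: case differs from "World"
```

Both lookups fail for different reasons: the first because the mapped value is a different word, the second only because of letter case.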

Original code
The original code looked up each category in a hardcoded dictionary.

categories = {
    "world": "world",
    "technology": "technology",
    "business": "business",
    "science": "science",
    "health": "health",
    "politics": "politics",
    "entertainment": "arts",
    "sport": "sports"
}

@shared_task
def fetch_and_store_news(category):
    API_KEY = config("NYT_API_KEY")
    section = categories.get(category, "home")
    url = f"https://api.nytimes.com/svc/topstories/v2/{section}.json"
    params = {'api-key': API_KEY}

    response = requests.get(url, params=params)
    if response.status_code == 200:
        data = response.json()
        articles = data.get('results', [])
        print(f"Fetching articles for category: {category}")
        
        for article in articles[:5]:  
            news_id = article.get('url', 'No URL')  
            redis_client.set(
                f"news:{news_id}",
                json.dumps({
                    'title': article.get('title', 'No Title'),
                    'abstract': article.get('abstract', 'No Abstract'),
                    'url': article.get('url', 'No URL'),
                    'published_date': article.get('published_date', 'No Date'),
                    'category': category
                })
            )
    elif response.status_code == 429:
        print(f"Too many requests for category: {category}. Please wait.")
        time.sleep(10)  
    else:
        print(f"Failed to fetch articles for category: {category}, Status Code: {response.status_code}")

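How it was fixed
Synchronize with the database
Remove the hardcoded dictionary and load category information dynamically from the database.

Resolve the case mismatch
Use name__iexact so that the database lookup ignores case.

Change how the API section is chosen
Convert the category name to lowercase and use it as the API section name.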
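
The lookup logic can be sketched in plain Python, without Django. This is an approximation of what a `name__iexact` filter followed by `.lower()` does, not the ORM itself; `resolve_section` and `db_names` are hypothetical names introduced for illustration.

```python
def resolve_section(category_name, db_names):
    """Mimic a case-insensitive lookup followed by lowercasing.

    `db_names` is a hypothetical stand-in for the names in the category table.
    Returns the API section name, or None when no category matches.
    """
    wanted = category_name.lower()
    for stored in db_names:
        if stored.lower() == wanted:
            return stored.lower()
    return None


db_names = ["World", "Technology", "Art"]
print(resolve_section("world", db_names))   # world
print(resolve_section("ART", db_names))     # art
print(resolve_section("travel", db_names))  # None
```

This approach only works when the lowercased category name is itself a valid section name. A stored "Art" becomes "art", while the original dictionary used "arts", so it may be worth checking the stored names against the API's section list.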

import json
import time

import requests
from celery import shared_task
from decouple import config  # assumption: config() comes from python-decouple

# Category (a Django model) and redis_client are defined elsewhere in the
# project; their import paths are not shown in this post.


@shared_task
def fetch_and_store_news(category_name):
    API_KEY = config("NYT_API_KEY")

    # Look up the category in the database, ignoring case
    category = Category.objects.filter(name__iexact=category_name).first()
    if not category:
        print(f"Category '{category_name}' does not exist in the database.")
        return

    # Use the lowercased category name as the API section
    section = category.name.lower()
    url = f"https://api.nytimes.com/svc/topstories/v2/{section}.json"
    params = {'api-key': API_KEY}

    # requests has no default timeout, so set one to avoid a hung worker
    response = requests.get(url, params=params, timeout=10)
    if response.status_code == 200:
        data = response.json()
        articles = data.get('results', [])
        print(f"Fetching articles for category: {category_name}")

        # Store only the first five articles
        for article in articles[:5]:
            news_id = article.get('url', 'No URL')
            redis_client.set(
                f"news:{category_name}:{news_id}",
                json.dumps({
                    'title': article.get('title', 'No Title'),
                    'abstract': article.get('abstract', 'No Abstract'),
                    'url': article.get('url', 'No URL'),
                    'published_date': article.get('published_date', 'No Date'),
                    'category': category_name
                }),
                ex=86400  # expire after 24 hours
            )
    elif response.status_code == 429:
        print(f"Too many requests for category: {category_name}. Please wait.")
        # Pause this worker for 10 seconds; the request itself is not retried
        time.sleep(10)
    else:
        print(f"Failed to fetch articles for category: {category_name}, Status Code: {response.status_code}")
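
The 429 branch above waits but does not repeat the request. One possible way to add a retry is sketched below; it is not part of the original task. `get_with_retry` is a hypothetical helper, the `get` argument is any callable shaped like `requests.get`, and reading a `Retry-After` header is an assumption about what the server might send, with a fixed wait as the fallback.

```python
import time


def get_with_retry(get, url, params, max_attempts=3, default_wait=10,
                   sleep=time.sleep):
    """Call `get(url, params=params)` and retry while the status is 429.

    `get` and `sleep` are injected so the helper can be exercised
    without network access or real waiting.
    """
    response = None
    for attempt in range(1, max_attempts + 1):
        response = get(url, params=params)
        if response.status_code != 429:
            return response
        if attempt < max_attempts:
            # Prefer the server's Retry-After hint (in seconds) when present.
            retry_after = response.headers.get("Retry-After", "")
            wait = int(retry_after) if retry_after.isdigit() else default_wait
            sleep(wait)
    # Still rate limited after the last attempt: hand the 429 back.
    return response
```

Inside the task this could be called as `response = get_with_retry(requests.get, url, params)`. Celery also offers its own retry mechanism for tasks, which avoids blocking a worker while waiting and may be the better fit in production.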

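Results
The task now stays in sync with the categories defined in the database, so storing and retrieving work correctly
The case mismatch is resolved as well, so every category is handled the same way
The news data stored in Redis can also be retrieved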
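
Reading the stored articles back might look like the sketch below. The post does not show the retrieval code, so this is an assumption built on the `news:{category_name}:{news_id}` key format used by the task; `get_news_for_category` is a hypothetical helper, and `client` is expected to offer `scan_iter(match=...)` and `get(key)` the way a redis-py client does.

```python
import json


def get_news_for_category(client, category_name):
    """Return the stored articles for one category as a list of dicts."""
    articles = []
    # Keys were written as news:{category_name}:{article url}.
    for key in client.scan_iter(match=f"news:{category_name}:*"):
        raw = client.get(key)
        if raw is not None:  # the key may have expired between scan and get
            articles.append(json.loads(raw))
    return articles
```

Because the task builds the key from the `category_name` argument exactly as it was passed in, the same spelling and letter case must be used when reading, or the pattern will match nothing.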
