Message Queues vs. Streaming Systems: Key Differences and Use Cases

Oleksii K., DevOps Engineer


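In the world of data processing and messaging systems, terms like "queue" and "streaming" come up often. They may sound similar, but they serve different purposes and can significantly affect how a system handles data. Let's break down the differences.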

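What Is a Message Queue?

Imagine a coffee shop where customers place orders online or in person. When an order is ready, the customer gets a pickup notification. In this analogy, each order acts like a message in a queue: the barista works through the orders one at a time and removes each from the queue once it is complete. This is how a message queue works.

Each message represents a discrete task that must be processed independently. Messages in a queue are consumed in order, and consumption is typically destructive: once a message has been processed, it is deleted from the queue.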

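Key characteristics of message queues:

Asynchronous communication: Producers can send messages even when consumers are not ready at the same moment. Just as when you order a coffee, you don't have to stand by while it is being made.
First-in, first-out (FIFO): Messages are processed in the order they are received, which matters for tasks that need strict ordering, such as bank transactions. Some queues can be configured to allow non-FIFO processing.
Durability: Messages are stored reliably until a consumer processes them, so they are not lost if the system fails.
Exclusive receipt: Each message is consumed by only one consumer instance, which prevents duplicate processing. The message is deleted once the consumer acknowledges it.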
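
The characteristics above can be sketched with Python's standard `queue.Queue`. This is only an illustrative in-memory model, not a real broker: it is not durable, and the function name `run_workers` and the sample orders are assumptions made for the example. It shows a producer enqueuing without waiting for consumers, and several workers draining the queue so that each order is taken by exactly one of them.

```python
import queue
import threading


def run_workers(orders, worker_count=3):
    """Distribute orders across workers; each order is taken by one worker only."""
    q = queue.Queue()  # FIFO queue
    for order in orders:
        q.put(order)  # the producer enqueues without waiting for a consumer

    handled = []  # (worker_id, order) pairs
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            try:
                order = q.get_nowait()  # destructive read: the item leaves the queue
            except queue.Empty:
                return  # nothing left to do
            with lock:
                handled.append((worker_id, order))
            q.task_done()  # acknowledge that this item is finished

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return handled, q.qsize()


if __name__ == "__main__":
    handled, remaining = run_workers(["latte", "espresso", "mocha", "tea"])
    print(sorted(order for _, order in handled), "left in queue:", remaining)
```

Which worker handles which order varies from run to run, but every order appears exactly once and the queue ends up empty.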

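Common use cases for queues:

Message queues are ideal when parallel processing and scalability are required. Examples include:

Inventory management: Tracking and updating stock levels in real time.
Healthcare systems: Managing patient flow and appointment scheduling.
Restaurant operations: Managing customer orders and reservations.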

What Is Message Streaming?

Now imagine a live concert where the music flows continuously. Message streaming centers on a continuous flow of data and real-time processing.

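Key characteristics of message streaming:

Real-time processing: Streamed messages are consumed as they are produced, much like listening to music on a streaming service.
Event-driven architecture: Data is delivered to consumers as soon as it becomes available, enabling immediate reactions. For example, a social media feed updates dynamically with new posts, likes, and comments.
Scalability: Streaming systems can handle massive volumes of data, which makes them well suited to real-time analytics, monitoring, and machine learning.
Message retention: Messages are stored for a specified period and can be replayed for batch processing or error recovery. Retention is based on time (for example, 7 days) or size (for example, 1 GB per partition).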
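
Retention and replay are easier to see in a small model. The sketch below is a toy append-only log for a single partition, assuming a size-based retention limit counted in records rather than bytes; the class name `Log` and its methods are hypothetical and do not correspond to any Kafka API. The point is that reading never deletes anything, so a consumer can re-read from an earlier offset as long as the records are still retained.

```python
class Log:
    """Toy append-only log for one partition with size-based retention."""

    def __init__(self, max_records):
        self.max_records = max_records
        self.records = []     # retained records, oldest first
        self.base_offset = 0  # offset of records[0]

    def append(self, value):
        """Append a record and return the offset assigned to it."""
        offset = self.base_offset + len(self.records)
        self.records.append(value)
        # Retention: drop the oldest records once the limit is exceeded.
        overflow = len(self.records) - self.max_records
        if overflow > 0:
            del self.records[:overflow]
            self.base_offset += overflow
        return offset

    def read(self, offset, limit=10):
        """Return (offset, value) pairs starting at offset; reading deletes nothing."""
        start = max(offset, self.base_offset) - self.base_offset
        chunk = self.records[start:start + limit]
        return [(self.base_offset + start + i, v) for i, v in enumerate(chunk)]


if __name__ == "__main__":
    log = Log(max_records=3)
    for event in ["a", "b", "c", "d"]:
        log.append(event)
    print(log.read(2))  # a consumer reading from offset 2
    print(log.read(2))  # replaying the same offset returns the same records
    print(log.read(0))  # offset 0 has aged out, so reading starts at the oldest retained record
```

Contrast this with the destructive read of a queue: here each consumer only tracks its own position, and the log decides independently when old records expire.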

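Common use cases for streaming:

Streaming powers many of the applications people use every day. Examples include:

Stock price monitoring: Providing real-time updates to traders.
Fraud detection: Identifying suspicious activity immediately.
Customer service analytics: Tracking interactions and sentiment in real time.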

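Why Use Queues in Apache Kafka?

At Confluent, the aim is to make Apache Kafka a universal solution for diverse data workloads, removing the dependence on proprietary systems. Traditional messaging systems often force users to choose between ordering and speed. Kafka now introduces queue support to close this gap, giving users the flexibility to process messages sequentially or in parallel.

This addition makes Kafka more versatile: it supports both streaming and queue-based workflows, and so covers a wider range of use cases.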

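How Are Queues Supported in Apache Kafka?

Kafka uses a log-based architecture that assigns each message an offset that is unique within its partition. Consumers read messages sequentially, which enables fault tolerance and message replay. With the new hybrid model, Kafka combines the strengths of traditional queues and the log-based design:

Parallel processing: Multiple consumers can consume messages at the same time.
Replayability: Messages can be replayed for recovery or reprocessing.
High throughput: Kafka also allows out-of-order processing where needed, while maintaining scalability and reliability.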

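Consumer Groups and Share Groups in Kafka

In Kafka, consumer groups manage how data is consumed from a topic. Each consumer group consists of multiple consumers that read the topic's partitions. Within a group, each partition is assigned to exactly one consumer at a time, although a single consumer may own several partitions. As a result, once the number of consumers exceeds the number of partitions, the extra consumers sit idle and adding more no longer increases throughput.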
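
The scaling limit of consumer groups can be illustrated with a simplified round-robin assignment. This is a sketch only: Kafka's actual partition assignors differ in detail, and the function name `assign_partitions` and the consumer names are assumptions made for the example. What it does capture is the rule that each partition has exactly one owner within the group.

```python
def assign_partitions(partitions, consumers):
    """Simplified round-robin: every partition gets exactly one owner in the group."""
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # Fewer consumers than partitions: one consumer owns two partitions.
    print(assign_partitions([0, 1, 2], ["c1", "c2"]))
    # More consumers than partitions: "c4" receives nothing and sits idle.
    print(assign_partitions([0, 1, 2], ["c1", "c2", "c3", "c4"]))
```

In the second call the fourth consumer ends up with an empty list, which is the inefficiency described above: beyond one consumer per partition, extra consumers add no capacity.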

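Share groups offer a more flexible approach for workloads that resemble traditional queue systems. Multiple consumers can read from the same partition, which allows finer-grained control over how data is shared and processed.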

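Key features of share groups:

Parallel reads: Multiple consumers in a share group can read from the same partition.
Dynamic scaling: More consumers can be added to handle peak load without repartitioning the topic.
Individual acknowledgment: Messages are acknowledged one by one, so unprocessed messages can be redelivered while records are still fetched in batches for efficiency.
Independent consumption: Consumers in different share groups can access the same topic without interfering with one another.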
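
To make individual acknowledgment and redelivery concrete, here is a toy model of one partition shared by several consumers. It is a sketch under stated assumptions, not Kafka's client API: the class `SharedPartition` and the method names `acquire`, `acknowledge`, and `release` are hypothetical. Records are handed out in small batches, each record is acknowledged on its own, and a released record becomes available to any consumer again.

```python
from collections import deque


class SharedPartition:
    """Toy model of one partition read by several consumers in a share group."""

    def __init__(self, records):
        self.available = deque(enumerate(records))  # (offset, value) awaiting delivery
        self.in_flight = {}                         # offset -> (consumer, value)
        self.done = []                              # offsets in acknowledgment order

    def acquire(self, consumer, max_records=2):
        """Hand the next available records to a consumer and mark them in flight."""
        batch = []
        while self.available and len(batch) < max_records:
            offset, value = self.available.popleft()
            self.in_flight[offset] = (consumer, value)
            batch.append((offset, value))
        return batch

    def acknowledge(self, offset):
        """The record was processed successfully; it is not delivered again."""
        self.in_flight.pop(offset)
        self.done.append(offset)

    def release(self, offset):
        """Processing failed; make the record available for redelivery."""
        _, value = self.in_flight.pop(offset)
        self.available.appendleft((offset, value))


if __name__ == "__main__":
    partition = SharedPartition(["o1", "o2", "o3", "o4"])
    batch_a = partition.acquire("A")  # consumer A gets offsets 0 and 1
    batch_b = partition.acquire("B")  # consumer B gets offsets 2 and 3, same partition
    partition.acknowledge(0)
    partition.release(1)              # A could not process offset 1
    partition.acknowledge(2)
    partition.acknowledge(3)
    retry = partition.acquire("B")    # B picks up the released record
    partition.acknowledge(1)
    print(batch_a, batch_b, retry, partition.done)
```

Adding a third consumer requires no change to the partition itself, which is the dynamic-scaling property listed above.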

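Do Share Groups Guarantee Ordering?

Not entirely. Within a batch, records are in offset order, but ordering across batches is not guaranteed. For example, if a consumer fails partway through a batch, another consumer may process later messages first, so delivery order across batches can be reversed.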
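
The failure scenario above can be traced with a few lines of Python. This is a deliberately simplified sketch: the function `delivery_order` is hypothetical, and it assumes that the unfinished records of the failed batch are redelivered only after every other batch has completed, which is one possible interleaving rather than a rule.

```python
def delivery_order(batches, failed_batch, processed_before_failure):
    """Return the order in which offsets finish processing.

    batches: lists of offsets, each list already in offset order.
    failed_batch: index of the batch whose consumer crashes.
    processed_before_failure: records that consumer completed before crashing.
    """
    order = []
    retry = []
    for index, batch in enumerate(batches):
        if index == failed_batch:
            order.extend(batch[:processed_before_failure])
            retry = batch[processed_before_failure:]  # redelivered later
        else:
            order.extend(batch)
    order.extend(retry)  # another consumer picks up the unfinished records
    return order


if __name__ == "__main__":
    # The consumer of the first batch crashes after finishing offset 0.
    print(delivery_order([[0, 1, 2], [3, 4, 5]], failed_batch=0, processed_before_failure=1))
```

With these inputs the result is `[0, 3, 4, 5, 1, 2]`: each batch is still internally ordered, but offsets 3 to 5 finish before offsets 1 and 2.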

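Real-World Example: A Retail Sales Event

Consider a retailer hosting a large sales event. The checkout system must handle a surge of orders efficiently. With share groups:

Parallel processing: Orders are distributed across multiple workers for parallel processing.
Dynamic resource allocation: The system can add consumers at peak load and scale back when demand is low.
Efficient processing: Orders that do not need strict ordering are processed quickly.

This flexibility lets the system adapt smoothly to fluctuating workloads, which supports both customer satisfaction and efficient resource use.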
