Using WebSocket with Python
What is WebSocket?
WebSocket is a protocol that enables real-time, bidirectional communication between a client (such as a browser) and a server. In traditional HTTP communication, the client sends a request and the server responds, so every exchange is initiated by the client. In contrast, with WebSocket, once the initial connection is established, both the client and the server can send messages to each other at any time without repeatedly establishing new connections.
Recently, interactive services like the OpenAI Realtime API and Hume AI have become more common, leading to an anticipated increase in demand for WebSocket. This article introduces the basics of how to use WebSocket, along with a look into related asynchronous processing.
Using WebSocket with Python
In Python, you can use WebSocket through the third-party websockets package, as shown below:
    import asyncio
    import websockets

    uri = "ws://..."

    async def hello():
        async with websockets.connect(uri) as websocket:
            await websocket.send("Hello, Server!")
            response = await websocket.recv()
            print(f"Server says: {response}")

    asyncio.run(hello())
- Connect to the WebSocket server using websockets.connect(uri).
- Send a message with websocket.send(message).
- Receive a message using websocket.recv().
Asynchronous Processing
The async and await keywords used in the previous code mark asynchronous processing: async def defines a coroutine, and await pauses it until the awaited operation completes. Asynchronous processing is especially effective when running multiple tasks concurrently.
    import asyncio

    async def task1():
        print("Task 1: Start")
        await asyncio.sleep(2)  # Wait for 2 seconds
        print("Task 1: End")

    async def task2():
        print("Task 2: Start")
        await asyncio.sleep(1)  # Wait for 1 second
        print("Task 2: End")

    async def main():
        await asyncio.gather(task1(), task2())

    asyncio.run(main())
In functions that use await, other tasks can run while waiting for the completion of the current task. This allows for efficient switching between tasks.
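The switching can be made visible by recording the order in which things happen. This is a minimal sketch using only the standard library; the worker and run_both names and the shortened delays are illustrative assumptions, not part of the original example.

```python
import asyncio


async def worker(name: str, delay: float, events: list) -> None:
    events.append(f"{name}: start")
    await asyncio.sleep(delay)  # hands control back to the event loop
    events.append(f"{name}: end")


async def run_both() -> list:
    events = []
    # gather() schedules both coroutines and waits until both have finished.
    await asyncio.gather(
        worker("task1", 0.2, events),
        worker("task2", 0.1, events),
    )
    return events


if __name__ == "__main__":
    for line in asyncio.run(run_both()):
        print(line)
```

Both tasks start before either one ends, and task2 finishes first because its wait is shorter, even though it was started second.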
Asynchronous Processing and Multithreading
Multithreading also handles multiple tasks, but there is a difference in how threads are utilized:
- In multithreading, each task has its own thread, and the operating system switches between threads, for example while one of them is waiting for an operation to complete.
- Asynchronous processing, on the other hand, does not create new threads. All tasks run on a single thread, and the event loop switches between them at await points.
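The difference in thread usage can be checked directly with threading.get_ident(). The sketch below uses only the standard library; the helper names are hypothetical and chosen for illustration.

```python
import asyncio
import threading


def ids_from_threads(n: int) -> list:
    """Run n tasks as threads and collect the thread ID each one ran on."""
    ids = []

    def task():
        ids.append(threading.get_ident())

    threads = [threading.Thread(target=task) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ids


async def _task(ids: list) -> None:
    await asyncio.sleep(0)  # give the other coroutines a chance to run
    ids.append(threading.get_ident())


async def _run_tasks(n: int) -> list:
    ids = []
    await asyncio.gather(*(_task(ids) for _ in range(n)))
    return ids


def ids_from_asyncio(n: int) -> list:
    """Run n tasks as coroutines and collect the thread ID each one ran on."""
    return asyncio.run(_run_tasks(n))


if __name__ == "__main__":
    print("caller :", threading.get_ident())
    print("threads:", ids_from_threads(3))
    print("asyncio:", ids_from_asyncio(3))
```

Every coroutine reports the ID of the thread that called asyncio.run(), while each threaded task reports an ID different from the caller's.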
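Multithreading is effective for blocking operations, such as a library call that waits on a device or a file. However, it has drawbacks such as overhead from thread switching (context switching) and increased memory consumption. In standard CPython, the global interpreter lock (GIL) also prevents threads from executing Python bytecode in parallel, so threads give little speedup for CPU-intensive work.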
In contrast, asynchronous processing avoids most of that context-switching overhead because all tasks run on a single thread. However, tasks only switch at await points, so if one task runs a long computation or a blocking call without awaiting, every other task has to wait. As such, it is suitable for IO-bound operations like API requests.
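The effect of a heavy task on the event loop can be observed with a small experiment. The sketch below is an illustration under assumptions: blocking_work, heartbeat, and demo are hypothetical names, time.sleep() stands in for any blocking call, and asyncio.to_thread() requires Python 3.9 or later.

```python
import asyncio
import time


def blocking_work(seconds: float) -> None:
    time.sleep(seconds)  # stand-in for any call that blocks the thread


async def heartbeat(stop: asyncio.Event, interval: float = 0.05) -> int:
    """Count how many times this task gets to run before stop is set."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(interval)
        ticks += 1
    return ticks


async def demo(offload: bool) -> int:
    stop = asyncio.Event()
    beat = asyncio.create_task(heartbeat(stop))
    await asyncio.sleep(0)  # let the heartbeat task start
    if offload:
        # Run the blocking call in a worker thread; the event loop stays free.
        await asyncio.to_thread(blocking_work, 0.3)
    else:
        # Called directly, the blocking call freezes the whole event loop.
        blocking_work(0.3)
    stop.set()
    return await beat


if __name__ == "__main__":
    print("ticks while blocked  :", asyncio.run(demo(offload=False)))
    print("ticks while offloaded:", asyncio.run(demo(offload=True)))
```

When the blocking call runs directly inside the coroutine, the heartbeat barely gets a turn; when it is moved to a thread, the heartbeat keeps ticking throughout.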
(For tasks that are computationally intensive or require precise timing, multiprocessing is often more effective. Unlike threads in standard CPython, which are limited by the GIL, separate processes can run truly in parallel on multiple CPU cores.)
For example, when using the OpenAI Realtime API to receive audio from a microphone in real-time and send the audio data to the API, you can use a combination of multithreading and asynchronous processing:
    import asyncio
    import queue
    import threading

    import pyaudio
    import websockets

    # Use a thread-safe queue to share data between the capture thread and the event loop
    audio_queue = queue.Queue()

    # Thread to capture audio using PyAudio
    def audio_stream():
        p = pyaudio.PyAudio()
        stream = p.open(format=pyaudio.paInt16,
                        channels=1,
                        rate=44100,
                        input=True,
                        frames_per_buffer=1024)
        print("Start recording...")
        while True:
            data = stream.read(1024)
            audio_queue.put(data)

    # Asynchronous function to send audio data via WebSocket
    async def send_audio():
        uri = "ws://localhost:8765"
        async with websockets.connect(uri) as websocket:
            while True:
                # queue.Queue.get() blocks, so calling it directly would freeze
                # the event loop. Run it in a worker thread instead
                # (asyncio.to_thread requires Python 3.9+).
                data = await asyncio.to_thread(audio_queue.get)
                if data is None:  # Sentinel value meaning "stop sending"
                    break
                await websocket.send(data)
                print("Sent audio data")

    # Start the audio capture thread and run the asynchronous task
    def main():
        # daemon=True lets the program exit even though the capture loop never returns
        audio_thread = threading.Thread(target=audio_stream, daemon=True)
        audio_thread.start()
        # Run the WebSocket sending task
        asyncio.run(send_audio())

    if __name__ == "__main__":
        main()
The audio capture process is a blocking operation, so it is executed in a separate thread using threading. In contrast, sending the audio data, which involves IO-bound operations like interacting with an API, is done using asynchronous processing. (Note: PyAudio can also be run in non-blocking mode using callbacks.)
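The hand-off between the capture thread and the asynchronous sender can also be tried without a microphone or a server. The sketch below is a variation, not the method used above: it swaps in an asyncio.Queue fed through loop.call_soon_threadsafe(), and fake_capture is a hypothetical stand-in for the PyAudio loop that produces a few dummy chunks followed by a None sentinel to signal the end.

```python
import asyncio
import threading


def fake_capture(loop, out_queue, n_chunks: int) -> None:
    """Hypothetical stand-in for the PyAudio capture thread."""
    for i in range(n_chunks):
        chunk = bytes([i]) * 4  # dummy "audio" data
        # asyncio.Queue is not thread-safe, so schedule the put on the loop.
        loop.call_soon_threadsafe(out_queue.put_nowait, chunk)
    # Sentinel: tells the consumer that no more data will arrive.
    loop.call_soon_threadsafe(out_queue.put_nowait, None)


async def consume(n_chunks: int) -> list:
    loop = asyncio.get_running_loop()
    out_queue = asyncio.Queue()
    producer = threading.Thread(
        target=fake_capture, args=(loop, out_queue, n_chunks), daemon=True
    )
    producer.start()

    received = []
    while True:
        chunk = await out_queue.get()  # awaits without blocking the event loop
        if chunk is None:
            break
        received.append(chunk)  # real code would send the chunk over WebSocket
    return received


if __name__ == "__main__":
    print(asyncio.run(consume(3)))
```

With this design the consumer awaits the queue directly, and the sentinel gives the sending loop a clean way to stop.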
Conclusion
In this article, we introduced WebSocket and asynchronous processing.
I found these concepts particularly confusing while working with the OpenAI Realtime API, so I put this together as a personal note. If you find any errors or have any feedback, I would appreciate your input.
Thank you for reading until the end.