Posts

Showing posts from October 2024

I tried out Granite 3.0

Granite 3.0 is an open-source, lightweight family of generative language models designed for a range of enterprise-level tasks. It natively supports multilingual use, coding, reasoning, and tool calling, making it a good fit for enterprise environments. I ran the models to see what tasks they can handle. Environment Setup: I set up Granite 3.0 in Google Colab and installed the necessary libraries with !pip install torch torchvision torchaudio, !pip install accelerate, and !pip install -U transformers. Execution: I tested both the 2B and 8B models of Granite 3.0. 2B Model: Here is the code sample for the 2B model: import torch; from transformers import AutoModelForCausalLM, AutoTokenizer; device = "auto"; model_path = "ibm-granite/granite-3.0-2b-instruct"; tokenizer = AutoTokenizer.from_pre…
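Since the preview cuts the snippet off, here is a minimal, self-contained sketch of loading the 2B instruct model with Hugging Face transformers. It follows the standard AutoModelForCausalLM / AutoTokenizer chat-template pattern; the prompt text and generation settings are illustrative choices, not taken from the original post.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ibm-granite/granite-3.0-2b-instruct"

# Load tokenizer and model; device_map="auto" places weights on a GPU if one is available.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
model.eval()

# Example prompt (illustrative, not from the original post).
chat = [{"role": "user", "content": "List three use cases for small language models."}]
input_ids = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=200)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0, input_ids.shape[1]:], skip_special_tokens=True))
```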

Janus 1.3B: A Unified Model for Multimodal Understanding and Generation Tasks

Janus is a new autoregressive framework that unifies multimodal understanding and generation. Unlike previous models, which used a single visual encoder for both understanding and generation tasks, Janus introduces two separate visual encoding pathways for these functions. Differences in Encoding for Understanding and Generation: In multimodal understanding tasks, the visual encoder extracts high-level semantic information such as object categories and visual attributes; it focuses on inferring complex meaning, so high-level semantic features matter most. In visual generation tasks, by contrast, the emphasis is on producing fine detail and maintaining overall consistency, so a lower-level encoding that captures spatial structure and texture is required. Setting Up the Environment: Here are the steps to run Janus…
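The decoupled design can be sketched in a few lines of PyTorch. This is purely a conceptual illustration of "two visual encoders, one autoregressive core", with made-up module names and dimensions; it is not the actual Janus implementation, which lives in its own repository with its own weights.

```python
import torch
import torch.nn as nn

class ToyJanusStyleModel(nn.Module):
    """Conceptual sketch only: separate visual encoders for understanding vs. generation."""

    def __init__(self, d_model=512, vocab_size=32000):
        super().__init__()
        # High-level semantic encoder used for understanding (stand-in for a ViT-like backbone).
        self.understanding_encoder = nn.Sequential(nn.Flatten(1), nn.LazyLinear(d_model))
        # Low-level spatial encoder used for generation (stand-in for a VQ-style tokenizer).
        self.generation_encoder = nn.Sequential(nn.Flatten(1), nn.LazyLinear(d_model))
        # Shared autoregressive core over the combined image + text token sequence.
        core_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.core = nn.TransformerEncoder(core_layer, num_layers=2)
        self.text_embed = nn.Embedding(vocab_size, d_model)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, text_ids, image, task="understanding"):
        # Route the image through the pathway that matches the task.
        encoder = (self.understanding_encoder if task == "understanding"
                   else self.generation_encoder)
        img_emb = encoder(image).unsqueeze(1)      # (B, 1, d_model)
        txt_emb = self.text_embed(text_ids)        # (B, T, d_model)
        hidden = self.core(torch.cat([img_emb, txt_emb], dim=1))
        return self.lm_head(hidden)                # next-token logits
```

The point of the sketch is only the routing: the same transformer core consumes whichever visual embedding the task calls for, which is the separation the Janus paper argues for.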

Entropix: Sampling Techniques for Maximizing Inference Performance

According to the Entropix README, Entropix uses an entropy-based sampling method. This article explains its sampling techniques, which are driven by entropy and varentropy. Entropy and Varentropy: Let's start with entropy and varentropy, since they are the key quantities that determine the sampling strategy. Entropy: In information theory, entropy measures the uncertainty of a random variable. The entropy of a discrete random variable ( X ) is defined as ( H(X) = -\sum_i p(x_i) \log p(x_i) ), where ( x_i ) is the ( i )-th possible state of ( X ) and ( p(x_i) ) is the probability of state ( x_i ). Entropy is maximized when the probability distribution is uniform; conversely, when one state is much more likely than the others, entropy decreases. Varentropy: Varentropy, closely related to entropy, represents the variance of the information content ( -\log p(x_i) ) around its mean, the entropy…
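As a concrete illustration of these two quantities, the snippet below computes the entropy and varentropy of a next-token distribution given raw logits. It is a minimal sketch of the definitions above, not code taken from the Entropix repository; the function and variable names are my own.

```python
import torch
import torch.nn.functional as F

def entropy_and_varentropy(logits: torch.Tensor):
    """Entropy and varentropy of the categorical distribution defined by `logits`."""
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()

    # Entropy: H(X) = -sum_i p(x_i) * log p(x_i)
    entropy = -(probs * log_probs).sum(dim=-1)

    # Varentropy: variance of the surprisal -log p(x_i) around its mean H(X).
    surprisal = -log_probs
    varentropy = (probs * (surprisal - entropy.unsqueeze(-1)) ** 2).sum(dim=-1)
    return entropy, varentropy

# A peaked distribution has low entropy; a flat one has maximal entropy and zero varentropy.
peaked = torch.tensor([10.0, 0.0, 0.0, 0.0])
flat = torch.zeros(4)
print(entropy_and_varentropy(peaked))
print(entropy_and_varentropy(flat))   # entropy = log(4), varentropy = 0
```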

Using WebSocket with Python

What is WebSocket? WebSocket is a protocol that enables real-time, bidirectional communication between a browser and a server. Traditional HTTP communication has the client send a request and the server return a response to exchange data. With WebSocket, by contrast, once the initial connection is established, both the client and the server can send and receive messages without repeatedly opening new connections. With interactive services such as the OpenAI Realtime API and Hume AI becoming more common, demand for WebSocket is expected to grow. This article introduces the basics of using WebSocket, along with a look at the related asynchronous processing. Using WebSocket with Python: In Python, you can use WebSocket as shown below: import asyncio; import websockets; uri = "ws://..."; async def hello(): async with websockets.connect(uri) a…
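The preview cuts off mid-statement, so here is a complete, minimal version of that client using the websockets library. The URI is a placeholder (the original post elides it), and the echo-style send/receive exchange is an illustrative assumption rather than the post's exact example.

```python
import asyncio
import websockets

# Placeholder endpoint; replace with a real WebSocket server URI.
uri = "ws://localhost:8765"

async def hello():
    # Open the connection once, then exchange messages over the same socket.
    async with websockets.connect(uri) as websocket:
        await websocket.send("Hello, server!")
        reply = await websocket.recv()
        print(f"Received: {reply}")

if __name__ == "__main__":
    asyncio.run(hello())
```

Because send and recv are coroutines, the same event loop can keep many such connections open concurrently, which is what makes WebSocket a natural fit for asyncio.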