Foros: general discussion (Thread #48996)

Index not being used (2023-04-05 02:16 by kishima #94871)

Hello,

I am creating an index on a table with a column that might have very large text (over 3k characters).
```
CREATE TABLE big_better_text (
id SERIAL PRIMARY KEY,
text TEXT
);

CREATE INDEX idx_big_text ON big_better_text USING gin(text gin_bigm_ops);
```
I wanted to use this extension because regular indexes are not used in this scenario, and I thought this extension would enable this behavior. Did I understand it correctly?
I want to do an exact string match on this column.

Thanks in advance!

Responder al #94871×

You can not use Wiki syntax
You are not logged in. To discriminate your posts from the rest, you need to pick a nickname. (The uniqueness of nickname is not reserved. It is possible that someone else could use the exactly same nickname. If you want assurance of your identity, you are recommended to login before posting.) Entrar

Re: Index not being used (2023-04-06 02:05 by fujii_masao #94893)

メッセージ #94871 への返信
> Hello,
>
> I am creating an index on a table with a column that might have very large text (over 3k characters).
> ```
> CREATE TABLE big_better_text (
> id SERIAL PRIMARY KEY,
> text TEXT
> );
>
> CREATE INDEX idx_big_text ON big_better_text USING gin(text gin_bigm_ops);
> ```
> I wanted to use this extension because regular indexes are not used in this scenario, and I thought this extension would enable this behavior. Did I understand it correctly?
> I want to do an exact string match on this column.

Yes, you can use pg_bigm to create a bi-gram index on a column with large text data. This allows the index to be used by a LIKE query such as the one below.

SELECT * FROM big_better_text WHERE text LIKE '...';
Responder al #94871

Responder al #94893×

You can not use Wiki syntax
You are not logged in. To discriminate your posts from the rest, you need to pick a nickname. (The uniqueness of nickname is not reserved. It is possible that someone else could use the exactly same nickname. If you want assurance of your identity, you are recommended to login before posting.) Entrar

Re: Index not being used (2023-04-06 04:03 by kishima #94895)

Reply To Message #94893
> メッセージ #94871 への返信
> > Hello,
> >
> > I am creating an index on a table with a column that might have very large text (over 3k characters).
> > ```
> > CREATE TABLE big_better_text (
> > id SERIAL PRIMARY KEY,
> > text TEXT
> > );
> >
> > CREATE INDEX idx_big_text ON big_better_text USING gin(text gin_bigm_ops);
> > ```
> > I wanted to use this extension because regular indexes are not used in this scenario, and I thought this extension would enable this behavior. Did I understand it correctly?
> > I want to do an exact string match on this column.
>
> Yes, you can use pg_bigm to create a bi-gram index on a column with large text data. This allows the index to be used by a LIKE query such as the one below.
>
> SELECT * FROM big_better_text WHERE text LIKE '...';

Would a query in the form

select big_better_text.text, count(*) as ct from big_better_text group by big_better_text.text order by ct desc;

be benefited from it? Tbh this is kind of a large scan, so maybe there's no good index for this. But if I specify some texts for identical string match, it indeed benefits the query.
Responder al #94893

Responder al #94895×

You can not use Wiki syntax
You are not logged in. To discriminate your posts from the rest, you need to pick a nickname. (The uniqueness of nickname is not reserved. It is possible that someone else could use the exactly same nickname. If you want assurance of your identity, you are recommended to login before posting.) Entrar

Re: Index not being used (2023-04-10 20:20 by fujii_masao #94950)

メッセージ #94895 への返信
> Would a query in the form
>
> select big_better_text.text, count(*) as ct from big_better_text group by big_better_text.text order by ct desc;
>
> be benefited from it? Tbh this is kind of a large scan, so maybe there's no good index for this. But if I specify some texts for identical string match, it indeed benefits the query.

I'm sorry to inform you that the pg_bigm GIN index cannot be used for GROUP BY queries.

You could try creating a btree index using a function on the column, e.g., CREATE INDEX testidx ON big_better_text (md5(text)), and then issue a query like "SELECT big_better_text.text, count(*) as ct FROM big_better_text GROUP BY md5(text), text ORDER BY ct DESC". This might allow you to use the index for GROUP BY on the large text column. However, it's not clear if using the index will provide better performance than a sequential scan.

Regards,
Responder al #94895

Responder al #94950×

You can not use Wiki syntax
You are not logged in. To discriminate your posts from the rest, you need to pick a nickname. (The uniqueness of nickname is not reserved. It is possible that someone else could use the exactly same nickname. If you want assurance of your identity, you are recommended to login before posting.) Entrar

Re: Index not being used (2023-04-06 04:04 by kishima #94896)

Reply To Message #94893
> メッセージ #94871 への返信
> > Hello,
> >
> > I am creating an index on a table with a column that might have very large text (over 3k characters).
> > ```
> > CREATE TABLE big_better_text (
> > id SERIAL PRIMARY KEY,
> > text TEXT
> > );
> >
> > CREATE INDEX idx_big_text ON big_better_text USING gin(text gin_bigm_ops);
> > ```
> > I wanted to use this extension because regular indexes are not used in this scenario, and I thought this extension would enable this behavior. Did I understand it correctly?
> > I want to do an exact string match on this column.

Thanks for the answer!
>
> Yes, you can use pg_bigm to create a bi-gram index on a column with large text data. This allows the index to be used by a LIKE query such as the one below.
>
> SELECT * FROM big_better_text WHERE text LIKE '...';
Responder al #94893

Responder al #94896×

You can not use Wiki syntax
You are not logged in. To discriminate your posts from the rest, you need to pick a nickname. (The uniqueness of nickname is not reserved. It is possible that someone else could use the exactly same nickname. If you want assurance of your identity, you are recommended to login before posting.) Entrar