hadoop – How to use aggregate functions in Hive on Group by columns

When you apply a function to a column, the result is no longer called the same thing. You should name it explicitly using the AS keyword.

select my_func(col1) as group1, col2 as group2 from xyz group by my_func(col1), col2;

Also, if you're only selecting the columns that you're grouping by, and not any aggregated data, DISTINCT may be more appropriate than GROUP BY.
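As a rough sketch of that DISTINCT alternative, assuming my_func is a deterministic scalar function (the placeholder name from the question) and xyz has columns col1 and col2, the two queries below should return the same set of rows:

```sql
-- GROUP BY form: one row per distinct (my_func(col1), col2) pair.
select my_func(col1) as group1, col2 as group2
from xyz
group by my_func(col1), col2;

-- DISTINCT form: same result, with no grouping keys to keep in sync.
select distinct my_func(col1) as group1, col2 as group2
from xyz;
```

GROUP BY becomes necessary once an aggregate over each group is also needed.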

The call to the aggregate function is in the wrong place. It should be made as follows:

select my_func(col1), col2 from xyz group by col1, col2;

select my_func(col1) as col1, col2 from xyz group by my_func(col1), col2;

The basic rule is that your GROUP BY needs to contain every non-aggregated column or expression that you mention in the SELECT clause.
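To illustrate the rule, here is a minimal sketch. It assumes my_func is a scalar (non-aggregate) function, and it uses count(*) purely as an example aggregate that does not appear in the original posts:

```sql
-- Typically rejected by Hive with an "Expression not in GROUP BY key" error:
-- col1 is selected, but only my_func(col1) is a grouping key.
-- select col1, col2 from xyz group by my_func(col1), col2;

-- Accepted: each non-aggregated SELECT expression matches a GROUP BY expression,
-- and count(*) is computed once per group.
select my_func(col1) as group1, col2, count(*) as row_count
from xyz
group by my_func(col1), col2;
```

The alias (group1) is assigned in the SELECT list, while GROUP BY repeats the full expression.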
