Following up on the previous article, 《用SolrJ操作Solr做搜索》 (Using SolrJ to Operate Solr for Search), which covered deploying and installing standalone Solr along with paged queries, highlighting, text completion, smart suggestions, and pinyin completion, this article walks through setting up distributed Solr (SolrCloud) and operating it from Java with SolrJ and Spring Data Solr.
First, set up a three-node ZooKeeper ensemble. The conf/zoo.cfg on each machine:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/tianduoduo/zk/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.140.133:2888:3888
server.2=192.168.140.135:2888:3888
server.3=192.168.140.134:2888:3888
On each machine, create a myid file under dataDir containing 1, 2, and 3 respectively, matching the server.N entries above, then start each server with bin/zkServer.sh start.
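As a quick sanity check that the ensemble is reachable, here is a minimal sketch using the ZooKeeper client bundled with Solr (the class name ZkEnsembleCheck is just for illustration):
import org.apache.zookeeper.ZooKeeper;

public class ZkEnsembleCheck {
    public static void main(String[] args) throws Exception {
        String connect = "192.168.140.133:2181,192.168.140.134:2181,192.168.140.135:2181";
        // The constructor returns immediately; requests block until the session is established.
        ZooKeeper zk = new ZooKeeper(connect, 10000, event -> {});
        try {
            System.out.println("root znodes: " + zk.getChildren("/", false));
        } finally {
            zk.close();
        }
    }
}
Next, start Solr on each node in SolrCloud mode, pointing it at the ensemble: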
[root@localhost solr-5.0.0]# bin/solr start -c -z 192.168.140.133:2181,192.168.140.134:2181,192.168.140.135:2181
Waiting to see Solr listening on port 8983 [\]
Started Solr server on port 8983 (pid=9219). Happy searching!
# -c starts Solr in SolrCloud mode; -z gives the ZooKeeper connection string
[root@localhost solr-5.0.0]# bin/solr create -c commenta_admin -d data_driven_schema_configs -shards 3 -replicationFactor 2
Connecting to ZooKeeper at 192.168.140.133:2181,192.168.140.134:2181,192.168.140.135:2181
Uploading /home/tianduoduo/solr/solr-5.0.0/server/solr/configsets/data_driven_schema_configs/conf for config commenta_admin to ZooKeeper at 192.168.140.133:2181,192.168.140.134:2181,192.168.140.135:2181
Creating new collection 'commenta_admin' using command:
http://192.168.140.135:8983/solr/admin/collections?action=CREATE&name=commenta_admin&numShards=3&replicationFactor=2&maxShardsPerNode=2&collection.configName=commenta_admin
{
  "responseHeader":{
    "status":0,
    "QTime":11866},
  "success":{"":{
      "responseHeader":{
        "status":0,
        "QTime":11423},
      "core":"commenta_admin_shard3_replica1"}}}
# -c collection name; -d config set to load; -shards number of shards; -replicationFactor number of replicas
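The same collection can also be created from Java. A minimal sketch against the SolrJ 5.x Collections API (method names per that version; verify them against your SolrJ release):
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.response.CollectionAdminResponse;

public class CreateCollectionExample {
    public static void main(String[] args) throws Exception {
        CloudSolrClient client = new CloudSolrClient(
                "192.168.140.133:2181,192.168.140.134:2181,192.168.140.135:2181");
        try {
            CollectionAdminRequest.Create create = new CollectionAdminRequest.Create();
            create.setCollectionName("commenta_admin");
            create.setConfigName("commenta_admin"); // config set already uploaded to ZooKeeper
            create.setNumShards(3);
            create.setReplicationFactor(2);
            create.setMaxShardsPerNode(2);
            CollectionAdminResponse rsp = create.process(client);
            System.out.println("created: " + rsp.isSuccess());
        } finally {
            client.close();
        }
    }
}
With 3 shards and a replication factor of 2, the six resulting cores are spread across the three nodes like this: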
192.168.140.133
commenta_admin_shard1_replica1
├── core.properties
└── data
├── index
│ ├── segments_1
│ └── write.lock
└── tlog
commenta_admin_shard2_replica2
├── core.properties
└── data
├── index
│ ├── segments_1
│ └── write.lock
└── tlog
192.168.140.134
commenta_admin_shard1_replica2
├── core.properties
└── data
├── index
│ ├── segments_1
│ └── write.lock
└── tlog
commenta_admin_shard3_replica1
├── core.properties
└── data
├── index
│ ├── segments_1
│ └── write.lock
└── tlog
192.168.140.135
commenta_admin_shard2_replica1
├── core.properties
└── data
├── index
│ ├── segments_1
│ └── write.lock
└── tlog
commenta_admin_shard3_replica2
├── core.properties
└── data
├── index
│ ├── segments_1
│ └── write.lock
└── tlog
To change a collection's configuration later, upload the new config set to ZooKeeper with zkcli.sh and then reload the collection through the Collections API:
$ sh zkcli.sh -cmd upconfig -zkhost <host:port> -confname <name for configset> -solrhome <solrhome> -confdir <path to directory with configset>
http://192.168.140.130:8983/solr/admin/collections?action=RELOAD&name=commenta
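The reload can also be issued from SolrJ (a short sketch, again assuming the 5.x Collections API):
CollectionAdminRequest.Reload reload = new CollectionAdminRequest.Reload();
reload.setCollectionName("commenta");
reload.process(cloudSolrClient); // re-reads the config set from ZooKeeper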
SolrCloudServerFactoryBean
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

import org.apache.http.client.HttpClient;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.impl.HttpClientUtil;
import org.apache.solr.client.solrj.impl.LBHttpSolrClient;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.springframework.beans.factory.DisposableBean;
import org.springframework.beans.factory.FactoryBean;
import org.springframework.beans.factory.InitializingBean;

public class SolrCloudServerFactoryBean implements FactoryBean<CloudSolrClient>, InitializingBean, DisposableBean {
    private CloudSolrClient cloudSolrClient;  // SolrCloud client shared by the application
    private String zkHost;                    // ZooKeeper ensemble connect string
    private String defaultCollection;         // default Solr collection
    private int maxConnections = 1000;        // max total connections to the cluster
    private int maxConnectionsPerHost = 500;  // max connections per host
    private int zkClientTimeout = 10000;      // ZooKeeper session timeout (ms)
    private int zkConnectTimeout = 10000;     // ZooKeeper connect timeout (ms)
    private final Lock lock = new ReentrantLock();

    @Override
    public void afterPropertiesSet() throws Exception {
        // Build an HttpClient with the pooling limits above, wrap it in a
        // load-balancing client, and hand that to CloudSolrClient.
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set(HttpClientUtil.PROP_MAX_CONNECTIONS, maxConnections);
        params.set(HttpClientUtil.PROP_MAX_CONNECTIONS_PER_HOST, maxConnectionsPerHost);
        HttpClient client = HttpClientUtil.createClient(params);
        LBHttpSolrClient lbClient = new LBHttpSolrClient(client);
        lock.lock();
        try {
            if (cloudSolrClient == null) {
                cloudSolrClient = new CloudSolrClient(zkHost, lbClient);
            }
        } finally {
            lock.unlock();
        }
        cloudSolrClient.setDefaultCollection(defaultCollection);
        cloudSolrClient.setZkClientTimeout(zkClientTimeout);
        cloudSolrClient.setZkConnectTimeout(zkConnectTimeout);
    }

    @Override
    public CloudSolrClient getObject() throws Exception {
        return cloudSolrClient;
    }

    @Override
    public Class<?> getObjectType() {
        return CloudSolrClient.class;
    }

    @Override
    public boolean isSingleton() {
        return true;
    }

    @Override
    public void destroy() throws Exception {
        if (cloudSolrClient != null) {
            cloudSolrClient.close(); // releases ZooKeeper and HTTP resources
        }
    }

    public void setZkHost(String zkHost) { this.zkHost = zkHost; }
    public void setDefaultCollection(String defaultCollection) { this.defaultCollection = defaultCollection; }
    // setters for the remaining properties follow the same pattern
}
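A minimal usage sketch outside of Spring (in a real application the factory is declared as a Spring bean and the CloudSolrClient is injected):
SolrCloudServerFactoryBean factory = new SolrCloudServerFactoryBean();
factory.setZkHost("192.168.140.133:2181,192.168.140.134:2181,192.168.140.135:2181");
factory.setDefaultCollection("commenta_admin");
factory.afterPropertiesSet();                   // normally invoked by the Spring container
CloudSolrClient client = factory.getObject();
System.out.println(client.ping().getStatus());  // 0 means the default collection is healthy
factory.destroy();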
CommentaSolrCloudService
public class CommentaSolrCloudService {

    private static final Logger logger = LoggerFactory.getLogger(CommentaSolrCloudService.class);
    // Assumed sentinel meaning "no paging"; the original constant's value is not shown
    private static final int DEFAULT_LIMIT = -1;

    private CloudSolrClient cloudSolrClient;

    public List<CommentaCommodity> queryCommodityData(Integer pageNo, Integer pageSize) {
        SolrQuery solrQuery = new SolrQuery();
        solrQuery.setQuery("*:*");
        if (pageNo.intValue() == DEFAULT_LIMIT && pageSize.intValue() == DEFAULT_LIMIT) {
            // No paging requested: return everything
            solrQuery.setStart(0);
            solrQuery.setRows(Integer.MAX_VALUE);
        } else {
            // Offset/limit paging; pageNo is 1-based
            solrQuery.setStart((pageNo - 1) * pageSize);
            solrQuery.setRows(pageSize);
        }
        try {
            QueryResponse queryResponse = cloudSolrClient.query(solrQuery);
            return queryResponse.getBeans(CommentaCommodity.class);
        } catch (Exception e) {
            logger.error(e.getMessage(), e);
            return null;
        }
    }
}
@Test
public void testqueryCommodityData(){
List<CommentaCommodity> list = commentaSolrCloudService.queryCommodityData(1, 5);
System.out.println(JSON.toJSONString(list));
}
Operating on data with Spring Data Solr:
CommoditySolrRepository
public interface CommoditySolrRepository {
    Page<CommentaCommodity> findCommodities(String value, Pageable page);
}
CommoditySolrRepositoryImpl
public class CommoditySolrRepositoryImpl implements CommoditySolrRepository {

    private SolrOperations solrTemplate;

    public CommoditySolrRepositoryImpl() {
        super();
    }

    public CommoditySolrRepositoryImpl(SolrOperations solrTemplate) {
        super();
        this.solrTemplate = solrTemplate;
    }

    @Override
    public Page<CommentaCommodity> findCommodities(String value, Pageable page) {
        // Wildcard match on the name field, e.g. name:*联想*
        return solrTemplate.queryForPage(
                new SimpleQuery(new SimpleStringCriteria("name:*" + value + "*")).setPageRequest(page),
                CommentaCommodity.class);
    }
}
@Test
public void testfindCommodities(){
String value = "联想";
PageRequest page = new PageRequest(1, 100); // page index is 0-based, so this requests the second page
Page<CommentaCommodity> resultPage = commoditySolrRepository.findCommodities(value, page);
System.out.println(JSON.toJSONString(resultPage.getContent()));
}
DerivedCommoditySolrRepository
public interface DerivedCommoditySolrRepository extends SolrCrudRepository<CommentaCommodity, String> {

    // Paged lookup via an annotated query
    @Query(CommentaCommodity.TYPE + ":0")
    Page<CommentaCommodity> findByTypeUsingAnnotatedQuery(Pageable page);

    // Paged lookup with highlighting
    @Highlight
    HighlightPage<CommentaCommodity> findByNameIn(Collection<String> names, Pageable page);

    // Faceted (grouped) lookup
    @Query(value = "count:[1 TO 10]")
    @Facet(fields = { "name" }, limit = 20)
    FacetPage<CommentaCommodity> findByNameAndFacetOnCount(String name, Pageable page);

    // Boost the name term over type
    Page<CommentaCommodity> findByNameOrType(@Boost(2) String name, String type, Pageable page);
}
@Test
public void testAnnotationQuery() {
    DerivedCommoditySolrRepository repo = new SolrRepositoryFactory(solrTemplate)
            .getRepository(DerivedCommoditySolrRepository.class, new CommoditySolrRepositoryImpl(solrTemplate));
    Page<CommentaCommodity> resultPage = repo.findByTypeUsingAnnotatedQuery(new PageRequest(0, 10));
    System.out.println(JSON.toJSONString(resultPage.getTotalElements()));
}
Parameter | Description |
---|---|
defType | Query parser used for the query string (the q parameter), e.g. defType=lucene |
sort | Sort order of the response, ascending (asc) or descending (desc), plus the field(s) to sort by, e.g. sort=price desc,score asc |
start | Offset of the first result to display; default 0 |
rows | Number of results to display at a time; default 10 |
fq | Filter query applied to the results (itself a query), e.g. fq=price:[100 TO *]&fq=section:0 |
fl | Fields to return in the results; each must have stored="true" or docValues="true", e.g. fl=id,title,product(price,popularity) |
debug | Extra debugging information in the results: timing (debug=timing), "explain" info (debug=results), or all debug info (debug=query) |
q | The query itself, written in the standard query syntax; required |
timeAllowed | Time budget for the query in milliseconds; if the query has not finished in time, only partial results are returned |
q.op | Default operator for the query expression: AND or OR |
omitHeader | When true, the response omits the header (e.g. how long the request took); default false |
wt | Output format of the response: xml, json, etc. |
logParamsList | Which parameters to write to the log, e.g. logParamsList=q,fq |
df | Default field to search |
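A short SolrJ sketch exercising several of these parameters (field names follow the commodity schema used above):
SolrQuery q = new SolrQuery();
q.setQuery("name:联想");                  // q: the main query (required)
q.addFilterQuery("count:[1 TO *]");       // fq: filter applied on top of q
q.setFields("id", "name");                // fl: stored/docValues fields to return
q.setSort("count", SolrQuery.ORDER.desc); // sort: count descending
q.setStart(0);                            // start
q.setRows(10);                            // rows
q.setTimeAllowed(2000);                   // timeAllowed, in ms
q.set("defType", "lucene");               // defType: standard query parser
QueryResponse rsp = cloudSolrClient.query(q);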
Syntax | Description |
---|---|
: | Specifies the field to search, e.g. title:"The Right Way" AND text:go |
? | Matches a single character, e.g. te?t matches test/text |
* | Matches zero or more characters, e.g. tes* matches test/testing/tester |
~ | Fuzzy query based on edit distance, e.g. roam~ matches roams/foam/foams/roam; roam~1 (edit distance at most 1) matches roams/foam but not foams |
~n | Proximity query, finding words within a given distance of each other, e.g. "jakarta apache"~10 (within 10 words) |
TO | Range query; {} excludes the bounds, [] includes them, e.g. title:{Aida TO Carmen} |
^ | Boost factor, e.g. jakarta^4 apache makes jakarta more relevant in the results |
^= | Assigns the clause a constant score, e.g. (description:blue OR color:blue)^=1.0 text:shoes |
AND (&&) | Both terms must appear, e.g. "jakarta apache" AND "Apache Lucene" |
OR | At least one of the terms must appear; this is the default operator, e.g. "jakarta apache" jakarta is equivalent to "jakarta apache" OR jakarta |
NOT (!) | The term after the operator must not appear, e.g. "jakarta apache" NOT "Apache Lucene" |
+ | The term after the operator must appear (known as the "required" operator), e.g. +jakarta lucene requires jakarta, while lucene is optional |
- | The term after the operator must not appear, e.g. "jakarta apache" -"Apache Lucene" |
[] | Range with inclusive bounds |
{} | Range with exclusive bounds |
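For illustration, a query string combining several of these operators; ClientUtils.escapeQueryChars guards user input against the special characters in the table:
String escaped = ClientUtils.escapeQueryChars("联想");          // escape *, ?, ~, etc. in user input
String queryStr = "+name:*" + escaped + "* count:[1 TO 10]^2"; // name is required; an in-range count boosts the score
SolrQuery q = new SolrQuery(queryStr);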
/**
 * Uses Solr's Facet mechanism to get an effect similar to a grouped lookup.
 * The Facet mechanism offers many optional parameters, e.g.:
 * 1. facet.field    field to facet on (must be an indexed field)
 * 2. facet.prefix   prefix that the facet values must match
 * 3. facet.limit    number of facet values to return
 * 4. facet.offset   offset into the facet values; combined with facet.limit this gives paging
 * 5. facet.mincount minimum count for a facet value to be returned; default 0
 * 6. facet.missing  if on/true, also counts records whose facet field is null
 * 7. facet.method   enum or fc; default fc, which stands for Field Cache
 * 8. facet.enum.cache.minDf only applies when facet.method=enum: the minimum document frequency for caching a term
 * @param fields fields to facet on
 */
public List<FacetInfo> queryDataByFacetParam(String queryStr, String... fields) {
    SolrQuery solrQuery = new SolrQuery();
    List<FacetInfo> list = new ArrayList<FacetInfo>();
    try {
        solrQuery.setFacet(true); // facet=on
        solrQuery.addFacetField(fields);
        solrQuery.setFacetLimit(DEFAULT_FACET_LIMIT);
        if (StringUtils.isEmpty(queryStr)) {
            solrQuery.setQuery("*:*");
        } else {
            solrQuery.setQuery(queryStr);
        }
        QueryResponse response = cloudSolrClient.query(solrQuery);
        List<FacetField> facets = response.getFacetFields(); // one FacetField per faceted field
        for (FacetField facetField : facets) {
            for (FacetField.Count count : facetField.getValues()) {
                list.add(new FacetInfo(count.getName(), count.getCount()));
            }
        }
        return list;
    } catch (Exception e) {
        logger.error(e.getMessage(), e);
        return null;
    }
}
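The helper above only sets facet.field and facet.limit; the other parameters from the Javadoc list map onto SolrQuery just as directly, for example:
solrQuery.setFacetPrefix("联");     // facet.prefix
solrQuery.setFacetMinCount(1);      // facet.mincount
solrQuery.set("facet.offset", 0);   // facet.offset, for paging through facet values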
/**
 * Queries data with a custom scoring (boost) rule.
 * @param bfValue  boost function passed to edismax's bf parameter
 * @param pageNo   1-based page number
 * @param pageSize page size
 * @param queryStr query string
 * @return matching commodities, or null on error
 */
public List<CommentaCommodity> queryCommodityDataByBoost(String bfValue, Integer pageNo, Integer pageSize, String queryStr) {
    SolrQuery solrQuery = new SolrQuery();
    solrQuery.setQuery(queryStr);
    if (pageNo.intValue() == DEFAULT_LIMIT && pageSize.intValue() == DEFAULT_LIMIT) {
        solrQuery.setStart(0);
        solrQuery.setRows(Integer.MAX_VALUE);
    } else {
        solrQuery.setStart((pageNo - 1) * pageSize);
        solrQuery.setRows(pageSize);
    }
    solrQuery.set("defType", "edismax"); // extended DisMax query parser
    solrQuery.set("pf", "name^100");     // phrase boost on the name field
    solrQuery.set("bf", bfValue);        // additive boost function
    try {
        QueryResponse queryResponse = cloudSolrClient.query(solrQuery);
        return queryResponse.getBeans(CommentaCommodity.class);
    } catch (Exception e) {
        logger.error(e.getMessage(), e);
        return null;
    }
}
The bf value below sums two function queries: recip(x,m,a,b) computes a/(m*x+b), here used to decay the contribution of _version_ so that more recently indexed documents score higher, and div(1000,count) rewards smaller counts:
System.out.println(JSON.toJSONString(commentaSolrCloudService.queryCommodityDataByBoost(
        "sum(recip(_version_,3.16e-11,1,1),div(1000,count))^100", 1, 10, "id:79* AND name:*联想*")));
This article is from the NetEase Practitioner Community (网易实践者社区), published with authorization from the author, 田躲躲 (Tian Duoduo).