Solr join query

I need to run a JOIN query on a Solr index. I've got two XMLs that I have indexed, person.


I need to display only information from the person doc, but each query should match fields in both person and subject. If the query matches only the subject doc, I need to display all docs from person that have a matching id. Is this possible without running two separate queries?

Something like a JOIN query would do the job. One thing to keep in mind is to always think of Solr indexes as single denormalized tables. This is sometimes a challenge, and there may be times when you are forced to use different indexes for each kind of data. This will have pretty poor performance if there are a large number of ids, but should be fine if there are just a few matching ids.

If you anticipate that your "join" queries will generally match a lot (say, hundreds) of subjects, then you're probably better off denormalizing as suggested.
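For comparison with the denormalizing advice, here is a minimal sketch of what a single-request join could look like with Solr's `{!join}` query parser. It assumes both document types live in the same index and that subject docs carry a `person_id` field pointing at the person doc's `id`; the field names `person_id` and `subject_title` and the helper `join_params` are hypothetical, not taken from the original schema.

```python
from urllib.parse import urlencode


def join_params(from_field, to_field, inner_query, fields="*"):
    """Build request parameters for a Solr {!join} query (illustrative helper)."""
    # {!join from=F to=T}Q: run Q, collect the F values of the matching docs,
    # then return the docs whose T field holds one of those values.
    q = "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)
    return {"q": q, "fl": fields}


# Hypothetical schema: match subject docs, return the person docs they point to.
params = join_params("person_id", "id", "subject_title:physics")
print(params["q"])
print(urlencode(params))  # query string to append to the /select handler URL
```

Only the person docs come back; the subject docs act purely as a filter. Whether this performs acceptably depends on how many ids the inner query matches.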


Apache Solr and Joins

I do not think it is possible to do what you are asking with a single query using your schema. (Question by Sfairas; answer by Pascal Dimassimo.)

Thanks very much Pascal. I don't know about changing the schema really. We've got some quite big XML files to index (about 4), each one with its own schema, having IDs that connect to one another.

One reason for using nested documents is to prevent false matches. But if we represented the SKUs as two different documents, then there would be no incorrect match.

All children of a parent document must be indexed together with the parent document.


One cannot update any document (parent or child) individually. The entire block needs to be re-indexed if any changes need to be made. Any document can have nested child documents. The locality of children and parents can be used to both speed up query operations and lower memory requirements compared to other join methods. Also see the Solr ref guide entry on the [child] doc transformer. Since our root (implicit) facet bucket, formed by the query and filters, consists of parent documents (books), we need to switch the facet domain to the children for the author facet.
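The domain switch to children can be sketched as a JSON Facet request body. This is a minimal illustration that assumes parents are marked with `type_s:book` and children carry an `author_s` field; both field names, the facet label `top_authors`, and the helper `child_terms_facet` are assumptions for the example.

```python
import json


def child_terms_facet(parent_filter, field):
    """Terms facet computed over the children of the parents in the current domain."""
    return {
        "type": "terms",
        "field": field,
        # blockChildren takes the filter that identifies ALL parent documents.
        "domain": {"blockChildren": parent_filter},
    }


facet = {"top_authors": child_terms_facet("type_s:book", "author_s")}
print(json.dumps(facet, indent=2))  # value for the json.facet request parameter
```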

By default, blockChildren will match all children of every parent doc from the input domain. The easiest way to limit children is with the filter clause. Note that regardless of which direction we are mapping (parents to children or children to parents), or what documents we are operating on, we provide a parent filter to define the complete set of parents in the index.

In addition to the main query parsers discussed earlier, there are several other query parsers that can be used instead of or in conjunction with the main parsers for specific purposes.

Many of these parsers are expressed the same way as Local Parameters in Queries. There are two query parsers that support block joins. These parsers allow indexing and searching for relational content that has been indexed as nested documents. The example usage of the query parsers below assumes these two documents and each of their child documents have been indexed: The parameter allParents is a filter that matches only parent documents; here you would define the field and value that you used to identify all parent documents.

The parameter someParents identifies a query that will match some of the parent documents. The output is the children.

We only get one document in response: Note that the query for someParents should match only parent documents passed by allParents or you may get an exception: The parameter someChildren is a query that matches some or all of the child documents. Note that the query for someChildren should match only child documents or you may get an exception:

We get this document in response: A common mistake is to try to filter parents with a which filter, as in this bad example:
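As a rough sketch of the two block join parsers described above, the helpers below assemble `{!child of=...}` and `{!parent which=...}` query strings. The marker `content_type:parentDocument` and the field names `title` and `comments` are assumptions for illustration; substitute whatever field and value identify all parent documents in your index.

```python
ALL_PARENTS = "content_type:parentDocument"  # assumed marker for every parent doc


def child_query(all_parents, some_parents):
    """Return the children of the parents matched by some_parents."""
    return '{!child of="%s"}%s' % (all_parents, some_parents)


def parent_query(all_parents, some_children):
    """Return the parents of the children matched by some_children."""
    return '{!parent which="%s"}%s' % (all_parents, some_children)


# Children of parents whose title matches "lucene".
print(child_query(ALL_PARENTS, "title:lucene"))

# Parents that have a child whose comments match "SolrCloud".
print(parent_query(ALL_PARENTS, "comments:SolrCloud"))

# Narrowing the parents: keep `which` matching ALL parents and add the parent
# restriction as a separate required clause, rather than putting it in `which`.
filtered = "+title:join +" + parent_query(ALL_PARENTS, "comments:SolrCloud")
print(filtered)
```

The last query shows the usual way around the mistake mentioned above: `which` always names the full set of parents, and any extra condition on parents goes in its own clause.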

You can optionally use the score local parameter to return scores of the subordinate query. The values to use for this parameter define the type of aggregation, which are avg (average), max (maximum), min (minimum), total (sum).

Implicit default is none, which returns 0.

The main value is the query to be boosted. Parameter b is the function query to use as the boost. The query to be boosted may be of any type. Creates a query "foo" which is boosted (scores are multiplied) by the function query log(popularity): Creates a query "foo" which is boosted by the date boosting function referenced in ReciprocalFloatFunction:

This parser collapses the result set to a single document per group before it forwards the result set to the rest of the search components.
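A minimal sketch of how the boost and collapsing parsers described above might be combined in one request. The field names `popularity` and `group_field` and the helper names are assumptions for illustration only.

```python
def boost_query(main_query, boost_function):
    """Wrap main_query so its score is multiplied by boost_function."""
    return "{!boost b=%s}%s" % (boost_function, main_query)


def collapse_filter(field):
    """Filter query that keeps a single document per distinct value of field."""
    return "{!collapse field=%s}" % field


request = {
    "q": boost_query("foo", "log(popularity)"),
    "fq": collapse_filter("group_field"),
}
print(request["q"])
print(request["fq"])
```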

All downstream components (faceting, highlighting, etc.) therefore work with the collapsed result set.

Under the covers, this query parser makes use of the Span group of queries. Set to true to force phrase queries to match terms in the order specified.

Default: true. Performance is sensitive to the number of unique terms that are associated with a pattern. It may be prudent to restrict wildcards to at least two or preferably three letters as a prefix.

Allowing very short prefixes may result in too many low-quality documents being returned. Applying ReversedWildcardFilterFactory in index-time analysis is usually a good idea. You may need to increase MaxBooleanClauses in solrconfig. This property is described in more detail in the section Query Sizing and Warming.

Let's say we add the terms the, up, and to to stopwords. The next query that does use the Complex Phrase Query Parser, as in this query: If you must remove stopwords for your use case, use a custom filter factory or perhaps a customized synonyms filter that reduces given stopwords to some impossible token.

Special care has to be given when escaping: clauses between double quotes (usually the whole query) are parsed twice, so these parts have to be escaped twice. The FieldQParser extends the QParserPlugin and creates a field query from the input value, applying text analysis and constructing a phrase query if appropriate.

Most of the technical details are covered in this talk by Martijn van Groningen.

I have a single-segment 55 GB index with 27 M docs: about a million parent documents, each with five children. I used SolrMeter with a slightly modified RandomExecutor, which tries to keep a specified rate of queries per time period. It also provides several useful statistics and charts. In addition, I attached iostat traces to show system load during tests. Query Result Cache and Filter Cache have been disabled. Document Cache is enabled and shows a hit ratio of about 0.

See more about these Solr bolts and nuts.

Other Parsers

You can see that Join almost never ran for less than a second, and the CPU saturated with requests per minute. Adding more queries harmed latency. The whole index was cached in RAM via memory-mapped file magic. I used Sen for the same queries with blockjoin. You see it! Search now takes only a few tens of milliseconds and survives with 6K requests per minute (100 qps).


And you see plenty of free CPU! We can check where Join uses so much CPU power with jstack: How could that be? I ran two tests to understand how cached index files impact performance. You should know that not all files in your index are equally valuable. In other words, tune your schema wisely. In my index the frq file is 7.

We recently had a client who wanted some up-front sense of whether Apache Solr provided join support and, if so, how it performed. Naturally, the client wanted to use a join query in the most painful way, so I set out to make a prototype.

Of course I ran into some issues, but one of the delights of working for Lucidworks is that I have ready access to many of the people who wrote the code, something to treasure! Being able to access these folks makes me look waaaay smarter than I am….

Anyway, on my MacBook Pro I ran some rather unscientific experiments, but enough to give me a sense of how a join query performs in one particular case. For this experiment, I created an index consisting of 26M documents. They were divided up into groups, one text document and 5 metadata documents. The metadata documents also had an integer field in the range. The whole purpose of this setup was to form queries that returned the text docs for which a metadata doc existed granting access.

The complexity of granting access is…er…low, I just did a range query. I could configure the number of simultaneous threads firing off queries. Note that I was testing this form because it applied to the customer, but I suspect that the other forms have the same issue.

As I mentioned, one of the pleasures of working for Lucid is having access to people who deeply understand the code. So I chatted with the join author, Yonik Seeley, and discovered, of course, that the scenario I was testing was the worst performance-wise. So these results are worst-case. A note about these rather counter-intuitive numbers. On a dual-core machine, we see that with 2 threads. The 5 and 10 thread client rows simply show that each individual request takes longer, end-to-end, but there are more queries being served by Solr simultaneously.

When I took the join part out, performance went up about 15x. I was monitoring the CPU, and it was pegged with 2 threads, which makes sense. These numbers, assuming that they are representative of your particular situation, could well be killers. On the other hand, they may be fine if your particular situation is serving a small community of users for whom the time spent waiting for a query to return is well-spent.

It might also mean that the case that Solr join functionality was meant to solve takes an unnecessarily restrictive approach for this particular problem.


High-performance Join in Solr with BlockJoinQuery


Nested Objects in Solr

