
R - Best way to find the path of an element in nested list?

2024-03-17 19:30:05
What is the best way to find the path of an element in a nested list? Or at least a good way that doesn't involve manually digging through the list in View()? Here is an example that I can already deal with:

l1 <- list(x = list(a = "no_match", b = "test_noname", c ="test_noname"),
           y = list(a = "test_name"))

After looking for an off-the-shelf solution in other packages, I found this approach (strongly inspired by rlist::list.search):

list_search <- function(l, f) {
  ulist <- unlist(l, recursive = TRUE, use.names = TRUE)
  match <- f(ulist)
  ulist[match]
}
list_search(l1, f = \(x) x == "test_noname")
          x.b           x.c 
"test_noname" "test_noname" 

This works pretty well, as it’s easy to see that the name “x.b” here can be translated into an access like this:

l1[["x"]][["b"]]
[1] "test_noname"
# Or
purrr::pluck(l1, "x", "b")
[1] "test_noname"

And I can get all elements on the same level, by leaving out the last level index:

l1[["x"]]
$a
[1] "no_match"

$b
[1] "test_noname"

$c
[1] "test_noname"

This is usually my goal, as I know the values/name of one of the elements I want to get to and other similar elements are placed on the same sub-level (or sub-sub-sub-sub-sub-sub-sub-level).
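
For fully named lists like l1, the translation from an unlist name to an access path can be automated. The following is a minimal sketch, assuming that no list name itself contains a dot (unlist joins names with ".", so such names would be split incorrectly). The variable names hit, path and siblings are illustrative only.

```r
l1 <- list(x = list(a = "no_match", b = "test_noname", c = "test_noname"),
           y = list(a = "test_name"))

# Name produced by unlist() for the first match
hit <- "x.b"

# Split on literal dots to recover the path (assumes no dots inside names)
path <- strsplit(hit, ".", fixed = TRUE)[[1]]

# `[[` accepts a vector and indexes the list recursively
l1[[path]]
# [1] "test_noname"

# Drop the last step to get all elements on the same level
siblings <- l1[[head(path, -1)]]
names(siblings)
# [1] "a" "b" "c"
```

This only works because every level of l1 is named, so each piece of the split name is a valid key.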

However, many JSON files on the internet are not meant for easy consumption and parse into much more complicated lists that look more like this:

l2 <- list(x = list("no_match", list("test_noname1", "test_noname2")), y = list(a = "test_name"))
str(l2)
List of 2
 $ x:List of 2
  ..$ : chr "no_match"
  ..$ :List of 2
  .. ..$ : chr "test_noname1"
  .. ..$ : chr "test_noname2"
 $ y:List of 1
  ..$ a: chr "test_name"
list_search(l2, f = \(x) x == "test_noname1")
            x2 
"test_noname1" 

From the resulting names, I would guess that the element “x2” can be accessed like this:

l2[["x2"]]
NULL
# or maybe
l2[["x"]][[2]]
[[1]]
[1] "test_noname1"

[[2]]
[1] "test_noname2"

But to not also rake in “test_noname2” here, I actually need something like this:

l2[["x"]][[2]][[1]]
[1] "test_noname1"

And then generalise it to this to get other interesting matches:

l2[["x"]][[2]]
[[1]]
[1] "test_noname1"

[[2]]
[1] "test_noname2"

So the issue is essentially the unnamed elements in the list: unlist (and rapply, for that matter) does not give them names that are easy to generalise. Ideally there would be an automated way to translate these into a pluck call.

Solution:

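If the question is how to get the path given the contents of a cell, then use rrapply from the package of the same name. Its special .xpos argument passes each element's position in the nested list to f as an integer vector: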

library(rrapply)

ix <- rrapply(l2, 
  condition = \(x) x == "test_noname1",
  f = \(x, .xpos) .xpos,
  how = "flatten")

unlist(ix)
## 11 12 13 
##  1  2  1 

l2[[unlist(ix)]]
## [1] "test_noname1"

library(purrr)
pluck(l2, !!!unlist(ix))
## [1] "test_noname1"
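
Once an index path is known, the two follow-up steps from the question become mechanical: getting the siblings of the match, and turning the integer path into a more readable mix of names and positions. The sketch below hard-codes the path 1, 2, 1 shown above so that it runs without rrapply. path_labels is a hypothetical helper written for this illustration, not part of any package.

```r
l2 <- list(x = list("no_match", list("test_noname1", "test_noname2")),
           y = list(a = "test_name"))

# Index path to "test_noname1" (the values of unlist(ix) above)
path <- c(1L, 2L, 1L)

# Siblings: drop the last index; `[[` indexes recursively with a vector
siblings <- l2[[head(path, -1)]]

# Hypothetical helper: use the name at each level where one exists,
# otherwise keep the integer position
path_labels <- function(l, path) {
  labels <- vector("list", length(path))
  for (k in seq_along(path)) {
    nm <- names(l)[path[k]]
    labels[[k]] <- if (is.null(nm) || is.na(nm) || nm == "") path[k] else nm
    l <- l[[path[k]]]  # descend one level
  }
  labels
}

labels <- path_labels(l2, path)   # list("x", 2L, 1L)

# Walk the mixed path step by step in base R
value <- Reduce(\(acc, key) acc[[key]], labels, l2)
value
# [1] "test_noname1"
```

The resulting labels list can also be spliced into pluck with !!!, in the same way as unlist(ix) above.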

Note

Input from question

l2 <- list(x = list("no_match", list("test_noname1", "test_noname2")),
           y = list(a = "test_name"))
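
If adding a dependency is not an option, the same index path can be found with a short recursive function in base R. This is only a sketch and not how rrapply works internally. find_paths is a hypothetical name, and it assumes that every non-list element is a leaf for which the predicate returns a single TRUE or FALSE.

```r
# Hypothetical helper: returns a list of integer index paths, one for
# every leaf of `l` for which the predicate `f` is TRUE
find_paths <- function(l, f, path = integer(0)) {
  if (!is.list(l)) {
    # Leaf: keep the current path only if the predicate holds
    if (isTRUE(f(l))) return(list(path))
    return(list())
  }
  out <- list()
  for (i in seq_along(l)) {
    # Recurse into each child, extending the path by its position
    out <- c(out, find_paths(l[[i]], f, c(path, i)))
  }
  out
}

l2 <- list(x = list("no_match", list("test_noname1", "test_noname2")),
           y = list(a = "test_name"))

paths <- find_paths(l2, \(x) x == "test_noname1")
paths[[1]]
# [1] 1 2 1

l2[[paths[[1]]]]
# [1] "test_noname1"
```

Because the paths are positions rather than names, unnamed elements cause no ambiguity, and each path can be passed directly to `[[`.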